Downloading a backup (without uploads) and restoring to a local instance PostgreSQL, that you can query with pgAdmin3 is your best bet.
Yes, the Data Explorer plug-in does not allow changes to be made. But I’d suggest starting with that anyway. You can run queries to identify if you do have problems and then run queries to select the records that need to be changed. And you will be protected from accidentally wrecking your database.
I’m assuming that you’re running in production already. so I’d be very wary of modifying the database without some sort of approval from the Discousrse team.
Once you are familiar with the Discourse database and you know the size of the problem, then you could consider your options for making the changes. You might find you can do it all in Discourse. Or you might have so few changes that the command line is fine.
I am a serious noob at using anything not involving pointing and clicking.
I have however managed to set up a local instance of Discourse (under windows 10) by following the official guides (without understanding much of anything), so presume somewhere there will be guidance for installing pgAdmin3 to the same place?
Silly question, but is this offline Discourse instance the thing I should be restoring the exported sql file to (or is there some other way to ‘restore’ a Discourse database backup file to PostgreSQL? ).
And how can I query it once restored? i.e. where and how do I point pgAdmin3 to the database in a running Discourse instance? where does a running discourse database physically reside? Does the fact that Discourse instance is running lock the database in some way?
Can’t seem to see any files obviously corresponding to a database amongst the files in my local server ~ /var/discourse directory
I am self hosting Discourse so have no access to the Discourse team
Am setting up and running forum for a small community, purely on voluntary basis (i.e no budget), and am bringing in data from both MyBB and Yahoo groups forums - the latter of which has introduced for example a bunch of old administrative auto generated email notifications as topics in Discourse, which are pretty irrelevant in this context
It is likely I will need to test and make global changes in a sporadic incremental / ongoing basis, as I discover new issues - hence the desire to minimize learning curve and cli based anything!
I saw your domain name and checked you out first time so I somewhat understand your situation. I’m in Wellington, by the way, setting up and actively using a couple of private forums for charities.
Given your limited time and experience I’d suggest starting with the simplest and easiest to use tools first and only moving onto the next level if you are forced. There’s a high cost to mastering anything more than the first level which is just using Discourse itself. It can be incredibly powerful when used as designed.
The Discourse GUI with the features that you will need to master anyway.
Get up to speed with what you need to use in Discourse. You might find that you can do 99% of what you need without going any further. So you might not go any further.
Prioritise actual visible problems.
For example, your old email notifications might not be a big issue. If they are irrelevant then they will, as designed, drop out of any prominence in Discourse as more relevant topics are created. Simply getting your forum going properly is more important than solving every data error.
So, how many of these automated email notifications have been converted to topics?
The Date Explorer plug-in will be useful for you so it is the next step to a better understanding of the database and to identify what might need to be done. Later it will help you with reporting but it is not absolutely necessary.
Not being able to make changes is a safety net because a few years ago there were a lot of users wrecking their data with precipitate updates.
The queries you generate to identify the problem records are only a step away from SQL to modify the database.
The final option is likely to be the command-line and scripts next to modify your database.
I’d rather run the risk of bad data on display than risk damaging the database by manually modifying it. That can create a time bomb because some data corruption might not be identified in a timely manner.
Using Data Explorer will give you query results which are real data examples and quantifiable numbers of records. You are more likely to get a correct answer from the team and other experts if they can understand your data and what you are trying to achieve. They can then advise you about the easiest and safest way to update the database.
Much of what you need is likely to be in topics already because other sites strike the same sort of problems.So you’ll just be copying someone else’s hard won knowledge.
This. Just don’t modify the database directly. You will regret it someday.
All sounds like sensible advice.
I might be able to crowd source a partial solution if I can convince some users to manually scan for and mark topics for deletion.
Right now the site on the domain is essentially a clean slate placeholder while a freelancer is fine tuning the import process from old forums (tweaking the import script for MyBB in particular, so that hopefully the custom user profile fields I had set up there will import along with users and their posts. Also hopefully parsing the MyCode in some MyBB posts properly (right now the formatting codes are visible)
I wasted a week fruitlessly attempting this import myself but just could not achieve a dev environment with all prerequisites in place that would allow me to do so from either a Windows 10 instance of Ubuntu or a DigitalOcean based instance, based on the official guide for use of the MyBB import script.
After painfully thrashing around, solving one error message after the next, I finally struck an absolute brick wall either way when it came to getting the SQL database to be accessible when executing the final ruby on rails command that would initiate the import.
Linux and Ruby both seem to be written by sadists, for masochists, both being astonishingly fragile and arcane. In such an environment, the odds of catastrophic problems when manipulating databases do seem high!
You have my sympathy.
Stick to being forum admin.
User friendliness has always been lacking in that environment. I think that the command line is king even more than it was 20 years ago.
But that’s why I like Discourse. The team make an effort to make core Discourse significantly friendlier. Unfortunately, migrating is a high-tech option.
It’s almost certainly better to do that from the Rails console.
I guess that might depend on your benchmark for ‘better’?
In my case, being a noob, the lowest possible learning curve and minimal complexity of access for someone who does not normally have a development environment set up, and who knows nothing about Ruby, (or indeed Linux) would be high priorities. Might be different if I had some other reason to (and time to) get up to speed with these things … In a perfect world, there would be some sort of GUI local windows app that could interrogate my Digital Ocean hosted Discourse setup directly…
If you decide to not make the changes directly to the database, you might find some of the commands described in this topic helpful: Administrative Bulk Operations. For example, it gives details about how to tag topics from the rails console. The most important thing is to take a backup of your site’s database before running any commands.
My benchmark for “Better” is “much less likely to leave your database in a broken state and completely broken.” Since you’re a “noob” you don’t know which tables need to be updated when you do . . . something.
How ever you do it, make sure you make frequent backups.
A valid point - I will try hard to find other ways in the first instance.
My ignorance is a large risk with either scenario, but is it true to say that conducting any database changes via Ruby will be safer than attempting the same changes using pg Admin4?
The risk of creating not immediately noticed damage has been alluded to for example - is there anything about one approach vs the other that can influence that risk?
In the back of my head, should I ever decide to risk it (after suitable backups taken) I envisaged a copy of pgAdmin4 running in my Digital Ocean droplet, that I could access directly via url in browser, rather than via cli console windows, and thus eliminating a couple of layers of complexity (am assuming here that this is even possible)
Pretty much. Ruby does a bunch of magic stuff to make sure that the Right Thing happens. For example, if you destroy (delete) something from a model, it knows when and what other stuff should be deleted. There’s plenty of “safe” stuff you can do with raw SQL, but I pretty much always do it in Rails if it’s possible.
Ah, that is good to know - thanks!
That looks potentially pretty useful - thanks!
How I solved
How can I directly edit Discourse database from a GUI? since it is not answered with what I was seeking.
Do not do this on a production machine.
This uses the PostgreSQL recommended admin tool pgAdmin 4
Note: Discourse was installed on Ubuntu 18.04 on WSL 2 on Windows 10 per Beginners Guide to Install Discourse on Windows 10 for Development
Using Install pgAdmin 4 on Ubuntu 20.04/18.04/16.04 as a template
$ echo "deb http://apt.postgresql.org/pub/repos/apt/ `lsb_release -cs`-pgdg main" |sudo tee /etc/apt/sources.list.d/pgdg.list deb http://apt.postgresql.org/pub/repos/apt/ bionic-pgdg main $ sudo apt update $ sudo apt install pgadmin4 pgadmin4-apache2 pgAdmin4 user email: postgres@localhost pgAdmin4 password: <password 1> $ sudo /etc/init.d/apache2 restart $ sudo ufw allow http $ sudo ufw allow https $ hostname -I Record <address> $ whoami Record <user name>
This next step may not be needed as I did not know how to get a Postgres DB user’s password as I am not a PostgreSQL expert, or if there was another way to set up the needed DB login for pgadmin4.
$ psql postgres
postgres=# ALTER ROLE <user name> '<password 2>';
Using Internet browser
http://<address>/pgadmin4 user: postgres@localhost password: <password 1>
Once pgAdmin4 starts
Create a sever connection
Tab: General Name: Discourse Development Server group: Servers Tab: Connection Host: localhost Port: 5432 Maintenance database: postgres Username: <user name> Password: <password 2>
This is not perfect but it works and is better than nothing. Feedback and suggestions welcome.
Software Catalogue - Administration/development tools
I find that for most actions it’s easier and a bit safer to access the rails console rather than directly interact with the database.
Or, if what you want to do is change a user’s password (oh, that’s not what you were trying to do, but this is still a fine example), do
cd /var/discourse ./launcher enter app rake admin:create
In spite of it’s name, that rake task will allow you to do
- create a user (but it’s fine if the user exists)
- change the password (but you don’t have to)
- make the user an admin (but you don’t have to).
Have a look at Administrative Bulk Operations for some other examples.
But here are a few:
users=User.all me=User.find_by_username ('pfaffman') me=User.find_by_email('firstname.lastname@example.org') UserEmail.create!(user: me, email: email@example.com') posts_with_uploads=Post.where("raw like '%upload%' ") Group.create( name: "mygreatgroup", automatic_membership_email_domains: 'literatecomputing.com', primary_group: true, title: "Literate Computing Staff", grant_trust_level: 4, flair_url: 'https://example.com/path.icon.png' )
Thanks for the feedback. Something more for me to learn.
While I have decades of development experience I have never used Ruby or Rails. I actually started programming with punch cards in college and personally with an Atari 800.
Yeah. I don’t know much about SQL that I didn’t learn (or was supposed to learn) in a database course in 1986.
The nice thing about rails is that that it knows how things are interconnected, so when you delete something, it deletes all the other stuff that depends on it, rather than leaving dangling stuff in your database. It also keeps you from setting things to values that don’t make sense.
I use those graphical tools when plunking about an unfamiliar database to do an import, but mostly never for Discourse, though I do look at
./db/schema.db (which I learned is generated magically and not available in github) to figure out where stuff is.