9 Commits

Author SHA1 Message Date
2fa0dc58d4 DEV: Allow multiple users to have the same email in IntermediateDB (#33318)
The import script will merge those users based on their email address.
2025-06-24 22:59:00 +02:00
b31189e0b4 DEV: Add --reset option to import command (#33317)
It deletes the MappingDB before executing the import. That's useful
during development when you repeat the import multiple times.
2025-06-24 22:58:32 +02:00
95f4460394 DEV: Import uploads from UploadsDB (#33259)
2025-06-20 22:59:57 +02:00
d304e708be FIX: Stop silently dropping first two rows during load_mapping (#33076)
Currently, the first two rows returned by `DiscourseDB#query_array` are
silently dropped during the column size check in
`DiscourseDB#load_mapping`. This happens because the rows object, while
an enumerator, isn't fully compliant: it doesn't rewind during
introspection. As a result, calls like `#first`, `#peek`, or `#any?`
advance the iterator.

Ideally, we’d fix this by updating the `query_array` enumeration
implementation. However, customizing the enumerator to be fully
compliant would likely introduce unnecessary perf overhead for all use
cases. So, this fix works around that limitation by building the map a
little differently.
2025-06-09 23:26:59 +02:00
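
The behavior described in d304e708be can be reproduced with a small stand-in. This is a minimal sketch, not the Discourse code: `ForwardOnlyRows` and `build_map` are hypothetical names, and the two-column check is an assumption about what "column size check" means. It shows how introspection calls consume rows on a forward-only enumerable, and one possible way to validate the row shape inside a single pass instead.

```ruby
# Hypothetical stand-in for a result set whose cursor only moves forward.
class ForwardOnlyRows
  include Enumerable

  def initialize(rows)
    @rows = rows
    @position = 0
  end

  # Every call resumes where the previous one stopped; it never rewinds.
  def each
    return enum_for(:each) unless block_given?

    while @position < @rows.size
      row = @rows[@position]
      @position += 1
      yield row
    end
  end
end

rows = ForwardOnlyRows.new([[1, 10], [2, 20], [3, 30], [4, 40]])
rows.first             # consumes [1, 10]
rows.any?              # consumes [2, 20]
remaining = rows.to_a  # only the last two rows are left

# One possible workaround (an assumption, not the actual fix):
# check the row shape while iterating, so nothing is consumed up front.
def build_map(rows)
  map = {}
  rows.each do |row|
    raise ArgumentError, "expected 2 columns, got #{row.size}" unless row.size == 2
    map[row[0]] = row[1]
  end
  map
end
```

Validating inside the loop keeps the cost at one pass over the rows and leaves the enumerator implementation untouched, which matches the trade-off the commit message describes.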
a48f33fda0 FIX: Ensure copy_data callbacks run even when all rows are skipped (#33002)
Currently, if a batch "copy" of an import step results in all rows being
skipped, the `after_commit_of_skipped_rows` callback is never triggered.
This happens because the callback is nested inside a block that only
runs when at least one row is inserted.

This change ensures the DB copy operation returns both inserted and
skipped rows, allowing the caller to respond appropriately in either
case.

---------

Co-authored-by: Gerhard Schlager <gerhard.schlager@discourse.org>
2025-06-02 23:07:28 +02:00
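
The idea in a48f33fda0 can be sketched as follows. This is an illustration under stated assumptions, not the importer's code: `CopyResult`, `copy_batch`, `process_batch`, and the `:skip` flag are hypothetical. The copy operation returns both the inserted and the skipped rows, so the caller can run the skipped-rows callback even when nothing was inserted.

```ruby
# Hypothetical result object carrying both outcomes of a batch copy.
CopyResult = Struct.new(:inserted_rows, :skipped_rows)

# Stand-in for the DB copy: rows flagged with :skip are not inserted.
def copy_batch(rows)
  skipped, inserted = rows.partition { |row| row[:skip] }
  CopyResult.new(inserted, skipped)
end

# The callbacks are siblings, so neither depends on the other having run.
def process_batch(rows, on_inserted:, on_skipped:)
  result = copy_batch(rows)
  on_inserted.call(result.inserted_rows) if result.inserted_rows.any?
  on_skipped.call(result.skipped_rows) if result.skipped_rows.any?
  result
end

# A batch in which every row is skipped still triggers the skipped callback.
skipped_ids = []
process_batch(
  [{ id: 1, skip: true }, { id: 2, skip: true }],
  on_inserted: ->(rows) { raise "nothing should be inserted" },
  on_skipped: ->(rows) { skipped_ids.concat(rows.map { |row| row[:id] }) },
)
```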
fc9946f595 DEV: Add converter & importer for permalink_normalizations
2025-05-31 22:17:44 +02:00
251cac39af DEV: Adds a basic importer for the IntermediateDB
* It only imports users and emails so far
* It stores mapped IDs and usernames in a SQLite DB. In the future, we might want to copy those into the Discourse DB at the end of a migration.
* The importer is split into steps which can mostly be configured with a simple DSL
* Data that needs to be shared between steps can be stored in an instance of the `SharedData` class
* Steps are automatically sorted via their defined dependencies before they are executed
* Common logic for finding unique names (username, group name) is extracted into a helper class
* If possible, steps try to avoid loading already imported data (via the `mapping.ids` table)
* Steps should select the `discourse_id` instead of the `original_id` of mapped IDs via SQL
2025-04-07 17:22:36 +02:00
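
Sorting steps by their declared dependencies, as mentioned in 251cac39af, is a topological sort. A minimal sketch using Ruby's standard `TSort` module is shown here; `StepSorter` and the step names are hypothetical, and the real importer may order its steps differently.

```ruby
require "tsort"

# Hypothetical sorter: maps each step name to the steps it depends on.
class StepSorter
  include TSort

  def initialize(dependencies)
    @dependencies = dependencies
  end

  # TSort hook: enumerate every step.
  def tsort_each_node(&block)
    @dependencies.each_key(&block)
  end

  # TSort hook: enumerate the steps that must run before `step`.
  def tsort_each_child(step, &block)
    @dependencies.fetch(step, []).each(&block)
  end
end

steps = {
  user_emails: [:users], # emails need the users to exist first
  users: [],
  uploads: [],
}

# Dependencies come before the steps that need them,
# so :users is placed ahead of :user_emails.
sorted_steps = StepSorter.new(steps).tsort
```

`TSort#tsort` raises `TSort::Cyclic` when the dependencies form a cycle, which gives a cheap sanity check on the step definitions.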
17ba19c7ae REFACTOR: Code generator for migrations IntermediateDB
* Splits the existing script into multiple classes
* Adds command for generating IntermediateDB schema (`migrations/bin/cli schema generate`)
* Changes the syntax of the IntermediateDB schema config
* Adds validation for the schema config
* It uses YAML schema aka JSON schema to validate the config file
* It generates the SQL schema file and Ruby classes for storing data in the IntermediateDB
2025-04-07 17:22:36 +02:00
d286c1d5a1 DEV: Prepare new structure for migrations-tooling (#26631)
* Moves existing files around. All essential scripts are in `migrations/bin`, and non-essential scripts like benchmarks are in `migrations/scripts`
* Dependabot configuration for migrations-tooling (disabled for now)
* Updates test configuration for migrations-tooling
* Shorter configuration for intermediate DB for now. We will add the rest table by table.
* Adds a couple of benchmark scripts
* RSpec setup specifically for migrations-tooling and the first tests
* Adds sorting/formatting to the `generate_schema` script
2024-04-15 18:47:40 +02:00