Discussion:
About migrations
Marcin Nowak
2017-06-23 09:25:31 UTC
Hi.

To begin with, please forgive my English; I'm not a native speaker.

I have written about DB migrations here and elsewhere a few times,
and Tim probably remembers me well.
My opinion has not changed, but I have realized that I will not be able to
leave the Django ecosystem for a long time.
So I'd like to talk about migrations, their advantages and
disadvantages, and possible solutions.

Advantages:

- fast (automatic) creation
- database independent (model-centric)
- a standard for Django itself and reusable apps
- possibility of creating migrations manually

Disadvantages:

- dependent on the application layer (Python code: field and model classes,
etc.)
- allows Python code within migrations
- separate files for changesets cause ugly conflicts when merging
branches
- squashing is required

I'd like to focus on the disadvantages, because they are the reason I use
an alternative solution (Liquibase, in my case).


*Application layer dependency*

This is what causes the whole migration system to fail.
References to the application layer are included in the migration files,
because a "model state" is saved.
Any significant code change will break migrations. We have to avoid such
situations by squashing migrations at "the right time".

In my opinion migrations should be application independent. Unfortunately
the whole system is based on models written in Python, which may include
custom solutions (e.g. model fields).

I understand that it is hard to cut this feature out, but I believe that
some solution exists which drops the app layer dependency and still allows
custom fields.


*Python code in migrations*

This is generally a bad idea. Any Python code is tied to a point in
time. When the code changes, a "pythonic" migration may fail.
And you will never know about the failure unless you set up CI properly.

There are very rare cases when something from the app layer must be called
between releases.
In those cases I use management commands, although Liquibase allows me to
execute any binary.
It is not so straightforward, so as a developer you feel that you're doing
something strange/non-standard, and that you should be careful.

The general problem is that some migrations stop working properly over
time.
This should never happen. Well... I'm used to this and I adhere to this
principle.

I have been using Liquibase for years and my migrations have broken only a
few times, mostly on changesets based on executing binaries (Python code
run through management commands).


*Separate files for changesets*

I like Liquibase because a single file can contain many
changesets.
When I need to split migration files, I can do that and use the "include"
directive.
The advantages: easy conflict resolution, just one file per database
and per current major release, and the full change history in older files.

Django produces separate migration files. This causes conflicts and
forces migrations to be reordered through declarations ("dependencies").
And because squashing is almost required for long-lived projects, you lose
the changeset history.


*Squashing required*

For long-lived projects this is a required operation. Squashing deletes
the change history and removes obsolete Python code from migrations.
Squashing is just a workaround for design issues in migrations.



*What do I need?*

Well, I'd like to simplify my work. That's obvious. Maintaining Liquibase
changesets outside Django is harder and takes more time, especially when
I'm adding new apps or upgrading existing ones.
But I don't want to switch back to Django migrations, because of the
design issues described above (mostly because of the app layer dependency
and the squashing requirement).

I started prototyping a tool which translates Django migrations into pure
SQL that can be embedded in Liquibase changesets.
But Python migrations can't be translated to SQL, of course. And the worst
thing is that Django ships Python migrations even in contrib apps.

So I realised that building such a tool without changing the concept
of Django migrations (by you, the Django developers) is a little
unreasonable as long as Python code execution is accepted and used
internally (contenttypes, 0002).


*What would I like to achieve with this post?*

A discussion about removing the app layer dependency and removing or
limiting RunPython usage, mostly.
This should eliminate the need for squashing and make migrations more
stable.

Separate files vs. one big file is not important for now. It is just
inconvenient, but does not cause failures.


Kind Regards,
Marcin
--
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-developers+***@googlegroups.com.
To post to this group, send email to django-***@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/bd46c905-4402-41b8-a56c-1783020467f5%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Andrew Godwin
2017-06-23 17:19:17 UTC
Hi Marcin,

Some of these are problems, yes, but you have to understand they are
tradeoffs and the alternative is, in my opinion, worse.
Post by Marcin Nowak
I believe that there exist some solution, which drops app layer
dependency ands allow using custom fields

I would love to see what this would be. I spent a decade trying not to
import custom fields from the code to allow this and never succeeded; the
only way that I could make it work was to import them. Because Django
allows fields to change their type based on the database being talked to,
you can't bake the type into the migration at creation time.
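To illustrate why the column type cannot be frozen when the migration is written, here is a minimal sketch. `FakeConnection` and `IPAddressField` are hypothetical stand-ins rather than Django classes; the only idea borrowed from Django is that a field's `db_type()` receives the connection and may answer differently per backend.

```python
from dataclasses import dataclass


@dataclass
class FakeConnection:
    """Hypothetical stand-in for a database connection; carries only a vendor name."""
    vendor: str


class IPAddressField:
    """Hypothetical custom field whose column type depends on the backend."""

    def db_type(self, connection):
        # Use a native type where the backend has one, otherwise fall back to text.
        if connection.vendor == "postgresql":
            return "inet"
        return "varchar(39)"


field = IPAddressField()
# The same field resolves to a different column type on each backend.
types = {v: field.db_type(FakeConnection(v)) for v in ("postgresql", "sqlite")}
print(types)
```

A migration that stored one of these strings at creation time would only be correct for the backend it was generated on.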
Post by Marcin Nowak
Python code in migrations
This has always been optional; Django's makemigrations will never put in
RunPython for you, so if you choose to use it, then it's your choice and
you get the problems along with it (more tied to application code).
Post by Marcin Nowak
Separate files for changesets
This is needed if you are going to have migrations per-app and thus let
apps ship migrations. If you want to not have per-app migrations, then you
can have just the one file, but then you make third-party apps have to ship
migration snippets you need to remember to include when you make the next
migration.
Post by Marcin Nowak
Squashing is a just a workaround tool for migrations design issues
Yes, specifically it is designed to help solve the
custom-fields-app-dependency issue. It's also just nice to have anyway when
you get to 100 migrations.
Post by Marcin Nowak
A discussion about removing app layer dependency and removing or limiting
RunPython usage, mostly.

OK, so what is your proposed alternative for custom fields? It is no good
to propose to remove something without proposing what to do in its place;
we can't drop custom field support in migrations. I think RunPython is
solvable by our users; if you want to limit usage, then just have a linter
rule in your project that rejects RunPython being committed by your
developers.
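A linter rule of this kind could be sketched roughly as follows. This is only an illustration using the standard library `ast` module; the function name `find_runpython` and the sample migration are made up for the example, and a real project would wire the check into its own pre-commit or CI tooling.

```python
import ast


def find_runpython(source):
    """Return the line numbers where RunPython is referenced in migration source."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        # Matches migrations.RunPython as well as a bare RunPython name.
        if isinstance(node, ast.Attribute) and node.attr == "RunPython":
            hits.append(node.lineno)
        elif isinstance(node, ast.Name) and node.id == "RunPython":
            hits.append(node.lineno)
    return sorted(set(hits))


# Hypothetical migration file content; it is only parsed, never imported.
SAMPLE = '''
from django.db import migrations

def forwards(apps, schema_editor):
    pass

class Migration(migrations.Migration):
    operations = [
        migrations.RunPython(forwards),
    ]
'''

print(find_runpython(SAMPLE))
```

Because the source is parsed rather than executed, the check does not need Django or the project's settings to be importable.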

Andrew
Marcin Nowak
2017-06-23 18:17:27 UTC
Post by Andrew Godwin
Some of these are problems, yes, but you have to understand they are
tradeoffs and the alternative is, in my opinion, worse.
Yes, I understand. But maybe there is another, better alternative.
Let's simplify a little and talk about the Python dependencies first.
Post by Andrew Godwin
I believe that there exist some solution, which drops app layer
dependency ands allow using custom fields
I would love to see what this would be. I spent a decade trying not to
import custom fields from the code to allow this and never succeeded; the
only way that I could make it work was to import them. Because Django
allows fields to change their type based on the database being talked to,
you can't bake the type into the migration at creation time
The advantage comes from database type independence, that is true, but on
the other hand you are introducing the app layer dependency.

Let's imagine that one of the built-in fields changes its definition.
Running migrations on two different Django versions will then produce two
different outputs.
My perspective is more database-like than app-like, so I expect the same
db schema as the result in both cases.

So the first thing that comes to my mind is this: a complete definition
should be baked into the migration file. Then, when the app layer changes
(e.g. upgrading the framework or changing a custom field definition), the
migration system should identify the change and produce a new migration
with the new baked-in definition. If this can be developed, you will have
fewer dependencies.
The definition (a meta-description of the field) will be baked in, instead
of depending on the field itself. And you will preserve database type
independence.

This is just the first concept that comes to my mind for now.
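A rough sketch of this idea, under stated assumptions: the `BAKED` dictionary format, `current_definitions()` and `new_migration()` are all hypothetical and are not part of Django. The version check and the widened length are purely illustrative of a built-in field changing its definition after a framework upgrade.

```python
# Description baked into an existing migration when it was created
# (hypothetical format, not something Django writes today).
BAKED = {"email": {"type": "varchar", "length": 75}}


def current_definitions(framework_version):
    """Hypothetical: what the app layer reports for each field right now."""
    # Assume a framework upgrade widened the built-in email field.
    length = 75 if framework_version < (1, 8) else 254
    return {"email": {"type": "varchar", "length": length}}


def new_migration(baked, current):
    """Return alter operations for fields whose description has drifted."""
    return [
        ("alter", name, definition)
        for name, definition in sorted(current.items())
        if baked.get(name) != definition
    ]


print(new_migration(BAKED, current_definitions((1, 7))))
print(new_migration(BAKED, current_definitions((1, 8))))
```

The old migration keeps producing the same schema on any framework version, and the drift shows up as a separate, explicit migration.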
Post by Andrew Godwin
Python code in migrations
This has always been optional; Django's makemigrations will never put in
RunPython for you, so if you choose to use it, then it's your choice and
you get the problems along with it (more tied to application code).
In that case I'd like to avoid RunPython in the built-in migrations of
Django's contrib apps, not to remove the possibility of running any
executable.
I'd just like to add a comment to it: "do it at your own risk" ;)
Post by Andrew Godwin
Separate files for changesets
This is needed if you are going to have migrations per-app and thus let
apps ship migrations. If you want to not have per-app migrations, then you
can have just the one file, but then you make third-party apps have to ship
migration snippets you need to remember to include when you make the next
migration.
Let's leave it as is for now. It isn't so important.
Post by Andrew Godwin
Squashing is a just a workaround tool for migrations design issues
Yes, specifically it is designed to help solve the
custom-fields-app-dependency issue. It's also just nice to have anyway when
you get to 100 migrations.
The second isn't an issue in practice. After x years I have thousands of
migrations in Liquibase, and the only downside is the time
required to run them all in a CI build. But this is automated, so nobody
cares about a minute or two.
Post by Andrew Godwin
A discussion about removing app layer dependency and removing or
limiting RunPython usage, mostly.
OK, so what is your proposed alternative for custom fields? It is no good
to propose to remove something without proposing what to do in its place;
we can't drop custom field support in migrations.
To be precise, I don't want to remove anything or break compatibility.
I'd like to improve some things.
The (first) proposal is about decoupling migrations from the app layer. I
gave an example a few lines above.

I don't want to write down all possible proposals at this moment, because I
suppose that you have been discussing this topic for a very long time.
We can focus now on the separation of field definitions, to achieve
consistency over time.

BR,
Marcin
Andrew Godwin
2017-06-23 19:27:39 UTC
Post by Marcin Nowak
The advantages comes from db type independency, this is true, but in the
other side you're including the app layer dependency.
Let's imagine that one of builtin field will change it's definition.
Running migrations on two different Django versions will produce two
different outputs.
My perspective is more database-like than app-like, so I'm expecting same
db schema as a result (for both cases).
So the first thing that comes into my mind sounds: a complete definiton
should be baked in migration file. Then, when app layer changes (i.e.
upgrading framework or changing custom field definition), the migration
system should identify the change and produce new migration with baked in
definition. If it is possible to develop, you'll achieve less dependencies.
The definition (a meta-description of the field) will be baked in, instead
of depending on the field itself. And you'll preserve database type
independency.
How do you propose to identify "when the app layer changes"? This is a
harder problem to solve than it first appears; the only things you can
compare against are the migration files themselves, so that necessarily
means you need some description of the app layer in there.
Post by Marcin Nowak
In that case I'd like to avoid RunPython in Django's contrib apps builtin
migrations, not to remove the possibility of running any executable.
I'd just like to add comment to it "do it at your own risk" ;)
It's only in one migration
(https://github.com/django/django/blob/master/django/contrib/contenttypes/migrations/0002_remove_content_type_name.py),
and this is because ContentTypes are something that is not purely
database-specific. I personally don't like contenttypes anyway, so I would
be a fan of making the whole thing vanish into the night, but that's not my
call and it would have backwards-compat issues.
Post by Marcin Nowak
The second isn't an issue, in practice. After x years I have thousands of
migrations in Liquibase, and the only one downside of this lies in time
required to run them all in CI build. But this is automated, so nobody
cares about minute or two.
I am impressed that your thousands of migrations only take a few minutes to
run; you must have a decent database backing it. Some backends are much
slower than this. Squash is offered as an option, and not required; and
some people don't even squash and just reset their migrations back to 0001.
Post by Marcin Nowak
To be precise - I don't want to remove anything and break compatibility.
I'd like to improve some things.
The (first) proposal is about decoupling migrations from the app layer. I
wrote the example few lines above.
I don't want to write all possible proposals at this moment, because I
suppose that you discussed the topic a very long time.
We can focus now on the sepearation of a field definitions, to achieve
consistency in time.
Understand when I say that what you are proposing is a very, very big
change. Django's ORM is heavily coupled to runtime information and the app
layer, and I tried for many years to decouple them and ran into all sorts
of issues as a result. Importing the fields from the source code ended up
being the easiest, safest method that also happens to produce very
easy-to-understand errors when it breaks (rather than using old definitions
or silently failing).

I am all for migration improvements, but the overall shape of what you are
suggesting seems like changing a few fundamental principles of how
migrations are designed and essentially designing one of the other types of
system (like django-evolution, or dmigrations). If you want a different
kind of system, then you are more than welcome to develop one; Django
migrations are very deliberately kept separate from the schema-changing
backends (SchemaEditor), so it's easier to write custom migration solutions
without having to redo all of the nasty per-database code and SQL
generation.

However, proposing to change the core of the way Django works is going to
come with a very high bar, and others and I are going to want to see
concrete proof of backwards-compatibility, improvements to developer
experience, and a person or people who are willing to put in all the work
to make and land the patch. Personally, I would probably ask that a
proposed alternate system be developed as a separate library to prove the
concept first (re-using all the bits of Django it needs to, like
SchemaEditor and some of the operations code).

Andrew
Marcin Nowak
2017-06-23 22:02:22 UTC
Post by Andrew Godwin
Understand when I say that what you are proposing is a very, very big
change. Django's ORM is heavily coupled to runtime information and the app
layer, and I tried for many years to decouple them and ran into all sorts
of issues as a result. Importing the fields from the source code ended up
being the easiest, safest method that also happens to produce very
easy-to-understand errors when it breaks (rather than using old definitions
or silently failing).
Understood.

From my perspective, after several years of working with Liquibase (since
2008, maybe), I have had no failures except for migration logic errors (due
to ordering) or failures when executing binaries.
This solution is completely separated from app logic, it is extremely
stable, and it supports databases that outlive any app.

The problem with an external solution lies in creating changesets, which is
a completely manual process, where changes from 3rd party apps or Django
itself cannot be applied automatically.
The only really problematic thing about Django migrations is the "heavy
coupling to the app layer", as you said, which may cause the migration
system to fail when the app layer changes.

I'm trying to find a solution that achieves both: stability and automation
(or semi-automation).

Post by Andrew Godwin
I am impressed that your thousands of migrations only take a few minutes to
run;
Believe it or not, I was thinking of hundreds. Sorry for the mistake. It
may be down to my poor English skills.
Currently there are 1480 changesets, to be precise. They are applied within
2-3 minutes on two databases (sequentially) on a weak Jenkins machine.

BR,
Marcin
Marcin Nowak
2017-06-27 13:07:50 UTC
*About database-agnostic migrations*

Liquibase is a tool which is decoupled from the app layer and makes it
possible to write database-agnostic changesets.
To stay independent you must use the built-in operations and restrict the
field types you use to a "well-known" subset
<http://www.liquibase.org/documentation/column.html>.
Anything else is passed through, just as plain SQL is passed through.

So as a DB architect I can decide what is most important to me:
portability or db-specific features.

This concept could easily be adopted by Django migrations, because the
operations are already implemented.


*About custom fields*

Liquibase is extensible. The concept is to extend the "well-known"
subset of fields by registering extensions.
For example, django.contrib.postgres could provide an extension to the
migration subsystem by registering new types and new operations.

3rd party apps would be required to deliver an extension to the
migration system, not to the project itself.
This would be a real "game changer". As long as the extension is installed,
your migrations will work.
You may recreate your app from scratch but keep the extensions, and your db
refactorings will still be safe.
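The registration idea could look roughly like the sketch below. `TypeRegistry`, its methods and the SQL templates are all hypothetical; nothing like this exists in Django, and the `json` entry only imagines what an extension shipped by django.contrib.postgres might contribute.

```python
class TypeRegistry:
    """Hypothetical registry of portable type names and their per-backend SQL."""

    def __init__(self):
        self._types = {}

    def register(self, name, **sql_by_vendor):
        # Refuse silent overrides so two extensions cannot claim the same name.
        if name in self._types:
            raise ValueError(f"type {name!r} is already registered")
        self._types[name] = sql_by_vendor

    def render(self, name, vendor, **params):
        try:
            template = self._types[name][vendor]
        except KeyError:
            raise LookupError(f"no {name!r} type for backend {vendor!r}") from None
        return template.format(**params)


registry = TypeRegistry()
# Core "well-known" subset, available on every backend.
registry.register("varchar", postgresql="varchar({length})", sqlite="varchar({length})")
# A backend-specific type contributed by an extension.
registry.register("json", postgresql="jsonb")

print(registry.render("varchar", "sqlite", length=100))
print(registry.render("json", "postgresql"))
```

A migration would then refer to `"json"` by name, and it keeps working for as long as the extension that registered the name is installed, regardless of what happens to the application code.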


*About files separation*

Liquibase has the include operation. Something similar could be implemented
for Django migrations.
The only problem is that Python code is rather unordered, so some
explicit ordering would have to be introduced.

This would be quite irrelevant for Python-based changesets (migration
files), but after dropping the direct application layer dependency it would
be possible to use a language other than Python to describe the operations
to apply.


*Application layer agnostic migrations*

The migrations subsystem must match field types to the registered
ones, and render a changeset (migration file) using these registered *names*
instead of the field classes directly.
Anything unmatched must be passed as a CUSTOM type or just passed through
by name (which may introduce an incompatibility if someone registers a field
of the same name in the future).
How to mark a custom type requires discussion, of course.
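As a minimal sketch of the matching step: the `REGISTERED` mapping and the `CUSTOM:` prefix are assumptions made up for this example, since the marking scheme is exactly what is left open above.

```python
# Hypothetical mapping from field class paths to registered type names.
REGISTERED = {
    "django.db.models.CharField": "varchar",
    "django.db.models.DecimalField": "decimal",
}


def portable_name(field_path):
    """Return the registered name, or mark the field as CUSTOM with its path."""
    if field_path in REGISTERED:
        return REGISTERED[field_path]
    # An explicit prefix keeps unmatched fields from colliding with a name
    # that someone registers later.
    return f"CUSTOM:{field_path}"


print(portable_name("django.db.models.CharField"))
print(portable_name("myapp.fields.MoneyField"))
```

Here `myapp.fields.MoneyField` is an invented path standing in for any field that no extension has registered.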


*Backward compatibility*

You can preserve backward compatibility by allowing directly
imported fields to be used, the same as today. In that case the
field-mapping layer will be bypassed.


*Automatic change detection*

It should work the same as today. But I must dig into the internals to
confirm that.


Marcin
Patryk Zawadzki
2017-07-04 14:04:09 UTC
On Friday, 23 June 2017 at 21:28:07 UTC+2, Andrew Godwin wrote:
Post by Andrew Godwin
Post by Marcin Nowak
The advantages comes from db type independency, this is true, but in the
other side you're including the app layer dependency.
Let's imagine that one of builtin field will change it's definition.
Running migrations on two different Django versions will produce two
different outputs.
My perspective is more database-like than app-like, so I'm expecting same
db schema as a result (for both cases).
So the first thing that comes into my mind sounds: a complete definiton
should be baked in migration file. Then, when app layer changes (i.e.
upgrading framework or changing custom field definition), the migration
system should identify the change and produce new migration with baked in
definition. If it is possible to develop, you'll achieve less dependencies.
The definition (a meta-description of the field) will be baked in, instead
of depending on the field itself. And you'll preserve database type
independency.
How do you propose to identify "when the app layer changes"? This is a
harder problem to solve that it first appears; the only thing you can rely
on to compare to are the migration files themselves, so that necessarily
means you need some description of the app layer in there.
Have DB backends understand certain field types expressed as strings
("varchar", "text", "blob", "decimal" and so on).

Possibly some backends could implement a wider set than the others ("json",
"xml", "rasterimage" etc.).

Have each Field class deconstruct to a field name and params, e.g.:
"decimal", {"digits": 12, "decimals": 2}.

Then a model becomes essentially a list of tuples:

[
("title", "varchar", {"length": 100}),
("price", "decimal", {"digits": 12, "decimals": 2}),
...
]

This is not far from what "render model states" does currently, except that
it compares much richer model descriptions, which leads to no-op migrations
being generated each time you change a label or a user-visible part of
choices.
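A small sketch of how such tuple lists could be compared. The `NON_SCHEMA` set, `schema_view()` and `diff()` are hypothetical names chosen for the example, and the list of attributes treated as irrelevant to the schema is an assumption.

```python
# Attributes assumed not to affect the database schema.
NON_SCHEMA = {"label", "help_text", "choices"}


def schema_view(model):
    """Map field name -> (type, schema-relevant params) for a list of tuples."""
    return {
        name: (type_name, {k: v for k, v in params.items() if k not in NON_SCHEMA})
        for name, type_name, params in model
    }


def diff(old, new):
    """Return the operations needed to turn the old model into the new one."""
    before, after = schema_view(old), schema_view(new)
    ops = [("add", n, after[n]) for n in after if n not in before]
    ops += [("alter", n, after[n]) for n in after if n in before and before[n] != after[n]]
    ops += [("remove", n, None) for n in before if n not in after]
    return ops


old = [
    ("title", "varchar", {"length": 100, "label": "Title"}),
    ("price", "decimal", {"digits": 12, "decimals": 2}),
]
new = [
    ("title", "varchar", {"length": 100, "label": "Product title"}),
    ("price", "decimal", {"digits": 14, "decimals": 2}),
]
print(diff(old, new))
```

In this example the relabelled `title` produces no operation, while the widened `price` produces a single alter.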
Marcin Nowak
2017-07-04 14:13:23 UTC
Have each Field class deconstruct to a field name and params [...]
Thanks, @patrys. Field deconstruction is the key to achieving what I tried
to describe earlier.
We can discuss the implementation details, but that is not important
right now.

Marcin
Carl Meyer
2017-07-04 21:49:07 UTC
Post by Patryk Zawadzki
Have DB backends understand certain field types expressed as strings
("varchar", "text", "blob", "decimal" and so on).
Possibly some backends could implement a wider set than the others
("json", "xml", "rasterimage" etc.).
"decimal", {"digits": 12, "decimals": 2}.
[
("title", "varchar", {"length": 100}),
("price", "decimal", {"digits": 12, "decimals": 2}),
...
]
This is not far from what "render model states" does currently except
that it compares much richer model descriptions that leads to no-op
migrations being generated each time you change a label or a
user-visible part of choices.
Right, and one reason for generating those "no-op" migrations is that
they aren't actually no-ops, if you value being able to write data
migrations in Python using the ORM. They keep the historical Python
models accurate.

Of course, we do pay a cost in complexity for the "historical ORM"
feature, and it's reasonable to prefer a tradeoff that doesn't pay that
cost and requires all data migrations to be written in SQL. As Andrew
mentioned, there's nothing to prevent anyone from writing an alternative
migrations frontend that takes this approach. It should be able to reuse
the schema editor backend, which does the heavy lifting of cross-db
schema alteration.
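
To make the tradeoff concrete, here is a minimal sketch of the schema-only comparison quoted above. It is not Django code: the `FieldSpec` tuple shape is taken from the quoted proposal, while `diff_schema` and its output strings are hypothetical names invented for illustration. Because labels and choices are not part of the description, changing them cannot produce a migration, and for the same reason the historical Python models cannot be rebuilt from it.

```python
from typing import Dict, List, Tuple

# Hypothetical schema-only field description: (column name, type string, params).
# Labels, help texts and choices are deliberately absent.
FieldSpec = Tuple[str, str, Dict[str, int]]


def diff_schema(old: List[FieldSpec], new: List[FieldSpec]) -> List[str]:
    """Return human-readable schema changes between two field lists."""
    old_by_name = {name: (type_, params) for name, type_, params in old}
    new_by_name = {name: (type_, params) for name, type_, params in new}
    changes = []
    for name in sorted(old_by_name.keys() - new_by_name.keys()):
        changes.append(f"drop column {name}")
    for name in sorted(new_by_name.keys() - old_by_name.keys()):
        changes.append(f"add column {name}")
    for name in sorted(old_by_name.keys() & new_by_name.keys()):
        if old_by_name[name] != new_by_name[name]:
            changes.append(f"alter column {name}")
    return changes


before = [
    ("title", "varchar", {"length": 100}),
    ("price", "decimal", {"digits": 12, "decimals": 2}),
]
# Widening the title column is a real schema change.
after = [
    ("title", "varchar", {"length": 200}),
    ("price", "decimal", {"digits": 12, "decimals": 2}),
]

print(diff_schema(before, after))   # ['alter column title']
print(diff_schema(before, before))  # []
```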

It's worth remembering, though, that five or six years ago we _had_ a
range of different migrations solutions that chose different tradeoffs,
and South was the clear winner in user uptake. It's not due to arbitrary
whim that the Django migrations system is based on South and preserves
its popular features, like the historical ORM.

Carl
Patryk Zawadzki
2017-07-07 12:09:07 UTC
Permalink
Post by Carl Meyer
Right, and one reason for generating those "no-op" migrations is that
they aren't actually no-ops, if you value being able to write data
migrations in Python using the ORM. They keep the historical Python
models accurate.
I would argue that this is a fairly optimistic view of the current state :)

They are technically "historically accurate" but the point in history they
represent is not necessarily the one you had in mind unless you only have a
single application and linear migrations (i.e. no merge migrations). Our
current dependency system only allows you to express "no sooner than X" but
the graph solver can execute an arbitrary number of later migrations
between the one you depend on and the one you wrote.

Imagine you have app A and migration M1 adds field F. You then create a
migration M2 in another application B that needs to access F so you have it
depend on (A, M1). Two months later field F is removed or renamed in
migration M3. Django has two ways to linearize the graph: (A, M1), (B, M2),
(A, M3) or (A, M1), (A, M3), (B, M2). Both are valid but the latter will
result in a crash when migrating from an empty DB state. In practice we
often have to add arbitrary dependencies to later migrations to force a
Python migration to execute in the correct order.
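
The two linearizations can be checked with a small brute-force sketch. This is not how Django's migration graph is implemented; `valid_orders` is a hypothetical helper that simply tries every permutation and keeps those in which each migration runs after everything it depends on. It assumes, as in the example, that M3 depends only on M1.

```python
from itertools import permutations

# Each migration maps to the set of migrations it depends on.
# Names follow the example above; M3 only depends on M1 (same app).
dependencies = {
    ("A", "M1"): set(),
    ("B", "M2"): {("A", "M1")},
    ("A", "M3"): {("A", "M1")},
}


def valid_orders(deps):
    """Yield every ordering in which each migration follows its dependencies."""
    for order in permutations(deps):
        position = {node: i for i, node in enumerate(order)}
        if all(position[dep] < position[node] for node in deps for dep in deps[node]):
            yield list(order)


orders = list(valid_orders(dependencies))
for order in orders:
    print(order)
# [('A', 'M1'), ('B', 'M2'), ('A', 'M3')]
# [('A', 'M1'), ('A', 'M3'), ('B', 'M2')]
```

Nothing in the declared dependencies rules out the second ordering, which is the one where M2 runs after field F is already gone.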

Also I'd argue that (from my personal experience which is obviously
limited) having access to historical "choices", a form field label or the
hint is not all that useful. In fact I'd be happy with a limited migration
system that always returns bare database values without executing any of
the field code. Writing portable migrations would be a bit more work but
it's mostly a price apps would pay as projects themselves rarely need to be
portable.

Anyway, I don't want anyone to think that I'm complaining, as I don't have
the resources to write yet another migration tool, and both South and Django
migrations beat writing SQL by hand.
Marcin Nowak
2017-07-07 12:42:19 UTC
Permalink
Post by Patryk Zawadzki
Anyway, I don't want anyone to think that I complain as I don't have the
resources to write yet another migration tool and both South and Django
migrations beat writing SQL by hand.
Have you ever tried Liquibase? It is very reliable; unfortunately it lacks
automatic changeset generation (because models aren't tracked) and
you must rewrite Django's and third-party apps' migrations yourself.
I started this topic to talk about improving Django migrations by
bringing over some of the advantages known from Liquibase.
Knowing LB may help everyone understand my proposals.

So the most important thing for me is separation from the application layer,
to improve stability.
The second is the ability to use Django's and third-party apps' migrations
directly, to help everyone with upgrades.
The third is "dependency hell": project-wide, flat files of sequences are
easier to maintain, especially when project dependencies (i.e. third-party
apps) change.
These matter a lot for long-term projects.

Marcin
Carl Meyer
2017-07-07 17:36:09 UTC
Permalink
Post by Patryk Zawadzki
Imagine you have app A and migration M1 adds field F. You then create a
migration M2 in another application B that needs to access F so you have
it depend on (A, M1). Two months later field F is removed or renamed in
migration M3. Django has two ways to linearize the graph: (A, M1), (B,
M2), (A, M3) or (A, M1), (A, M3), (B, M2). Both are valid but the latter
will result in a crash when migrating from an empty DB state. In
practice we often have to add arbitrary dependencies to later migrations
to force a Python migration to execute in the correct order.
Yeah, that's an issue I've certainly run into. It's not _that_
unreasonable or arbitrary to solve this by adding a dependency of (A,
M3) on (B, M2), but it would be better if we could express this as a
"must-run-before" dependency on the B side (since the dependency of app
B on app A may be one-way, and we shouldn't have to introduce knowledge
of B into A's migrations -- and A may even be third-party). IMO this
would be a reasonable feature addition.

Carl
Andrew Godwin
2017-07-07 21:53:30 UTC
Permalink
There is already a run-before constraint you can add to migrations for
exactly this purpose! It's called "run_before" and is in the same format as
the dependencies IIRC.
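
As a rough model of what that constraint does to the ordering: in a real project `run_before` is a class attribute on the migration, written as a list of (app label, migration name) pairs like `dependencies`. The sketch below does not import Django; the dictionaries are plain stand-ins for migration declarations, and it assumes that "X run_before Y" can be treated as "Y depends on X". The standard-library `graphlib` module (Python 3.9+) does the sorting.

```python
from graphlib import TopologicalSorter  # Python 3.9+

M1, M2, M3 = ("A", "M1"), ("B", "M2"), ("A", "M3")

# Plain stand-ins for migration declarations (not Django classes).
migrations = {
    M1: {"dependencies": [], "run_before": []},
    M2: {"dependencies": [M1], "run_before": [M3]},
    M3: {"dependencies": [M1], "run_before": []},
}

# Build node -> predecessors; "X run_before Y" makes X a predecessor of Y.
predecessors = {key: set(decl["dependencies"]) for key, decl in migrations.items()}
for key, decl in migrations.items():
    for later in decl["run_before"]:
        predecessors[later].add(key)

order = list(TopologicalSorter(predecessors).static_order())
print(order)  # [('A', 'M1'), ('B', 'M2'), ('A', 'M3')]
```

With the extra edge the ordering is forced, and app A's migrations never have to mention app B.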

Andrew
Adam Johnson
2017-07-07 22:36:10 UTC
Permalink
Docs link:
https://docs.djangoproject.com/en/1.11/howto/writing-migrations/#controlling-the-order-of-migrations
Post by Andrew Godwin
There is already a run-before constraint you can add to migrations for
exactly this purpose! It's called "run_before" and is in the same format as
the dependencies IIRC.
Andrew
Adam
Patryk Zawadzki
2017-07-11 13:12:37 UTC
Permalink
Post by Andrew Godwin
There is already a run-before constraint you can add to migrations for
exactly this purpose! It's called "run_before" and is in the same format as
the dependencies IIRC.
The problem with "run before X" is that there is no "X" at the point in
time where you write that migration.

I think it would be more robust if Django tried to run migrations _from
other apps_ that depend on the currently executing migration before
proceeding to run the next migration from the same app.
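
One possible reading of that idea, as a sketch only: `linearize` is a hypothetical function, not anything Django provides. After running a migration it first runs any ready migrations from other apps that depend on it, and only then continues with the same app. Note that M2 does not need to know M3 exists.

```python
dependencies = {
    ("A", "M1"): set(),
    ("B", "M2"): {("A", "M1")},
    ("A", "M3"): {("A", "M1")},
}


def linearize(deps):
    """Order migrations so that cross-app dependents of a migration run
    right after it, before the next migration of the same app."""
    done, order = set(), []

    def run(node):
        # Skip migrations already run or still waiting on a dependency.
        if node in done or not deps[node] <= done:
            return
        done.add(node)
        order.append(node)
        # Visit dependents, other apps first (False sorts before True).
        dependents = [n for n in deps if node in deps[n]]
        for nxt in sorted(dependents, key=lambda n: (n[0] == node[0], n)):
            run(nxt)

    for node in sorted(deps):
        run(node)
    return order


print(linearize(dependencies))
# [('A', 'M1'), ('B', 'M2'), ('A', 'M3')]
```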