

Notes about Django migration generated SQL

I have recently transitioned a service from Flyway to Django-migrations-based database management. To ensure a smooth data migration, I need to verify that the DDL generated by Django migrations is compatible with the existing schema.

pytest-django: how to create empty database for test cases

I am using pytest with the pytest-django plugin to write unit tests that compare the generated raw SQL. There are two test cases, both starting from an empty database: one executes the Flyway migrations, the other applies the Django migrations. If both test cases pass the same set of assertions about the resulting database (for example, that it contains certain tables, indexes, enum types, and constraints), I can be confident about the Django migration files.
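To make the shared assertions concrete, here is a minimal sketch of one such check: a small helper that extracts table names from a DDL dump so both test cases can assert against the same set. The regex and the sample DDL strings are simplified illustrations, not the real migration output.

```python
import re


def table_names(ddl: str) -> set:
    """Collect table names from CREATE TABLE statements in a DDL dump."""
    return set(re.findall(r'CREATE TABLE\s+"?([\w.]+)', ddl, re.IGNORECASE))


# Hypothetical dumps: Flyway tends to emit unquoted names, Django quotes them.
flyway_ddl = "CREATE TABLE organizations (id bigint PRIMARY KEY); CREATE TABLE cars (id bigint);"
django_ddl = 'CREATE TABLE "cars" ("id" bigint); CREATE TABLE "organizations" ("id" bigint);'

# The same assertion runs in both test cases.
assert table_names(flyway_ddl) == table_names(django_ddl) == {"organizations", "cars"}
```

Similar helpers can compare indexes, enum types, and constraints by querying `information_schema` instead of parsing text.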

The issue is that, by default, pytest-django creates a test database with the Django migrations already executed. The --no-migrations option does not create an empty database either: it disables migration execution and instead creates the tables by inspecting the Django models.

I would like pytest-django to have an option to disable Django migration execution, allowing for an empty database instance to be created. This would enable me to test the compatibility of my Django migration files more effectively.

Solution

The solution is to use a custom django_db_setup fixture for the test cases.

import uuid

import pytest
from django.db import connection, connections


@pytest.fixture
def django_db_setup(django_db_blocker):
    """Custom db setup that creates a new empty test db without any tables."""

    original_db = connection.settings_dict["NAME"]
    test_db = "test_" + uuid.uuid4().hex[:8]

    # First, connect to default database to create test database
    with django_db_blocker.unblock():
        with connection.cursor() as cursor:
            print(f"CREATE DATABASE {test_db}")
            cursor.execute(f"CREATE DATABASE {test_db}")

    # Update connection settings to use test database
    for alias in connections:
        connections[alias].settings_dict["NAME"] = test_db

    # Close all existing connections
    # force new connection to be created with updated settings
    for alias in connections:
        connections[alias].close()

    yield

    # Restore the default database name
    # so it won't affect other tests
    for alias in connections:
        connections[alias].settings_dict["NAME"] = original_db

    # Close all existing connections
    # force new connections to be created with the restored settings
    for alias in connections:
        connections[alias].close()

    # Drop the throwaway database so repeated runs don't accumulate
    with django_db_blocker.unblock():
        with connection.cursor() as cursor:
            cursor.execute(f"DROP DATABASE {test_db}")

Django generated foreign key with deferrable constraints

While comparing the generated DDL, I noticed that Django-generated foreign key constraints include DEFERRABLE INITIALLY DEFERRED. This means constraint checking is delayed until the end of the transaction.

It allows temporary violations of the foreign key constraint within a transaction, which can be helpful for inserting related records in any order.

Django’s ORM is designed to work with deferrable constraints:

  • It can help prevent issues when saving related objects, especially in complex transactions
  • Some Django features (like bulk_create with related objects) work better with deferrable constraints

No Downside for Most Applications:

  • Deferrable constraints still ensure data integrity by the end of each transaction
  • The performance impact is typically negligible
  • If a constraint must be checked immediately, it can still be enforced at the application level

So I keep the Django-generated foreign key constraints and consider the following two definitions equivalent:

FOREIGN KEY (manufacturer) REFERENCES organizations(id)

FOREIGN KEY (manufacturer) REFERENCES organizations(id) DEFERRABLE INITIALLY DEFERRED
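To illustrate what the deferrable variant buys you in practice, here is a minimal sketch using SQLite instead of PostgreSQL (SQLite also supports deferrable foreign keys and needs no server); the table names mirror the example above, but the schema is otherwise hypothetical:

```python
import sqlite3

# autocommit mode so we can manage the transaction by hand
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE organizations (id INTEGER PRIMARY KEY)")
conn.execute("""
    CREATE TABLE cars (
        id INTEGER PRIMARY KEY,
        manufacturer INTEGER REFERENCES organizations(id)
            DEFERRABLE INITIALLY DEFERRED
    )
""")

conn.execute("BEGIN")
# Insert the child row before its parent exists: with an immediate
# constraint this would fail right away, but the deferred check only
# runs at COMMIT.
conn.execute("INSERT INTO cars (id, manufacturer) VALUES (1, 42)")
conn.execute("INSERT INTO organizations (id) VALUES (42)")
conn.execute("COMMIT")  # the constraint is satisfied by now, so this succeeds
```

Had the transaction committed without inserting the parent row, the COMMIT itself would have raised an integrity error, so data integrity still holds at transaction boundaries.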


Build Your Own Workflow: Automating Tasks for Efficiency

Emacs is all about tailoring the text editor to fit your own needs, and the Lisp programming language is all about building the right abstraction for the problem. After years of using Emacs and writing Emacs Lisp, one lesson stands out: whenever you find yourself repeating a task, it’s time to build your own tools and create a personalized workflow. A good automated workflow isn’t just about saving time: it’s about using automation to offload tedious, repetitive details, freeing up mental energy for the things that truly matter in the task. With the rise of Large Language Models (LLMs), even tasks that previously required fuzzy judgment can now be automated.

On mobile, iOS Shortcuts offers a powerful way to design custom workflows. Pair it with tools like a-Shell, Scriptable, and Data Jar, and you can create powerful automation that integrates seamlessly with your daily tasks.

Example 1: Automating Stroke Order Lookup

Sometimes I forget the stroke order of a Chinese character and need to look it up. Instead of opening a third-party app or website every time, I built a Scriptable script that handles this for me: it visits zdict.net, downloads the stroke-order GIF, and displays it. Now, with one tap, I can quickly find the stroke order without breaking my flow.

Example 2: Automating Note-Taking for Language Learning

When learning German, I often need to record new vocabulary. To streamline this, I crafted a custom prompt for ChatGPT to help with translations and explanations, and then log the results into Obsidian for future study. I linked these steps into a single iOS Shortcut, making the process completely automated, from asking the question to saving the notes.

Example 3: Job Search Automation

Job hunting can be an exhausting process, involving multiple platforms, browsing job descriptions (JDs), evaluating opportunities, and tracking applications. After repeating these steps too many times, I developed a Python application combined with Selenium for browser automation. This app scrapes job postings, pulls the JDs, and stores the data in a Django backend. I also integrated ChatGPT to compare my resume with the job descriptions, giving me a quick assessment of which positions are a good match. This allows me to filter out irrelevant jobs and focus on those with the highest potential, leaving only the final judgment to manual review.

Build the Automation

By building your own workflow, you can minimize time wasted on tedious details and concentrate on what truly matters. Whether it’s handling simple everyday tasks or tackling more complex challenges like job hunting, there’s always room to streamline your process.


Introduction

In a Django application deployed with gunicorn, it is important to optimize the handling of database connections to improve performance and efficiency. This post explores the behavior of gunicorn in relation to database connections and provides insights on how to reuse connections effectively.

Django + Gunicorn

An important note about the Django ORM is that it does not have a built-in database connection pool like SQLAlchemy. The lifetime of persistent connections is controlled by the CONN_MAX_AGE setting, whose default value is 0.

The Sync Worker Class

In the case where CONN_MAX_AGE is set to 0, a new database connection is created for every HTTP request and closed at the end of each request. This can potentially lead to increased overhead due to the constant creation and closure of connections.

However, if CONN_MAX_AGE is set to a value greater than 0, the database connection is not closed immediately after the request is finished. In gunicorn’s sync worker class, which is process-based with a fixed number of processes, subsequent requests will be processed by the same processes, keeping the number of database connections stable.
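As a minimal settings sketch (the database name is a placeholder), CONN_MAX_AGE lives in the per-database dictionary in settings.py:

```python
# settings.py (sketch): enable persistent connections
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "mydb",       # hypothetical database name
        "CONN_MAX_AGE": 60,   # keep connections open up to 60 seconds; default is 0
    }
}
```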

The Async Worker Class: eventlet and gevent

For the default case where CONN_MAX_AGE is 0, there are no issues as connections are closed at the end of each request.

However, if CONN_MAX_AGE is not 0, problems can arise. In gunicorn’s async worker classes, such as eventlet and gevent, each new HTTP request is handled by a new greenlet. Each greenlet maintains its own connection, leading to an accumulation of database connections over time.

To tackle this issue, the official gunicorn documentation recommends using the gevent worker and activating greenlet support for Psycopg2 when connecting to PostgreSQL. This can be achieved using the psycogreen library.
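A sketch of what that looks like in a gunicorn config file; the post_fork hook and psycogreen.gevent.patch_psycopg are real APIs, but the file as a whole is a minimal illustration:

```python
# gunicorn.conf.py (sketch)
worker_class = "gevent"


def post_fork(server, worker):
    """Run in each worker process after forking."""
    # Make psycopg2 cooperative with gevent's greenlets,
    # so blocking database calls yield to other greenlets.
    from psycogreen.gevent import patch_psycopg

    patch_psycopg()
    worker.log.info("Patched psycopg2 for gevent")
```

The patch must run after the fork (not in the master process) so that each worker patches its own psycopg2 state, which is why it goes in post_fork rather than at module level.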

References

For more information, refer to the following resources: