
Notes about Django migration generated SQL

I recently transitioned a service from Flyway to Django-migration-based database management. To ensure a smooth data migration process, I need to verify that the DDL generated by Django migrations is compatible with the existing one.

pytest-django: how to create empty database for test cases

I am using pytest with the pytest-django plugin to write unit tests that compare the generated raw SQL. I have two test cases, both starting from an empty database: one executes the Flyway migration, the other applies the Django migrations. If both test cases pass the same assertions about the database (for example, that it contains certain tables, indexes, enum types, constraints, etc.), I can be confident about the Django migration files.

The issue is that pytest-django by default creates a test database instance with Django migrations already executed. The --no-migrations option does not create an empty database instance either; instead, it disables Django migration execution and creates the tables by inspecting the Django models.

I would like pytest-django to have an option to disable Django migration execution, allowing for an empty database instance to be created. This would enable me to test the compatibility of my Django migration files more effectively.

Solution

The solution is to use a custom django_db_setup fixture for the test cases.

import uuid

import pytest
from django.db import connection, connections


@pytest.fixture
def django_db_setup(django_db_blocker):
    """Custom db setup that creates a new empty test db without any tables."""

    original_db = connection.settings_dict["NAME"]
    test_db = "test_" + uuid.uuid4().hex[:8]

    # First, connect to default database to create test database
    with django_db_blocker.unblock():
        with connection.cursor() as cursor:
            print(f"CREATE DATABASE {test_db}")
            cursor.execute(f"CREATE DATABASE {test_db}")

    # Update connection settings to use test database
    for alias in connections:
        connections[alias].settings_dict["NAME"] = test_db

    # Close all existing connections
    # force new connection to be created with updated settings
    for alias in connections:
        connections[alias].close()

    yield

    # Restore the default database name
    # so it won't affect other tests
    for alias in connections:
        connections[alias].settings_dict["NAME"] = original_db

    # Close all existing connections
    # force new connection to be created with updated settings
    for alias in connections:
        connections[alias].close()
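
With this fixture in place, each test case starts from a truly empty database. As a rough sketch (not from the original post; the table name "organizations" and the assertion are placeholders), the Django-migrations test case could look like this:

import pytest
from django.core.management import call_command
from django.db import connection


@pytest.mark.django_db
def test_django_migrations_create_expected_schema():
    # Apply all Django migrations against the empty test database.
    call_command("migrate")

    # Assert on the resulting schema; "organizations" is a placeholder table name.
    tables = connection.introspection.table_names()
    assert "organizations" in tables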

Django-generated foreign keys with deferrable constraints

While comparing the generated DDL, I noticed that foreign key constraints in the Django-generated DDL have DEFERRABLE INITIALLY DEFERRED. This means the constraint check is delayed until the end of the transaction.

It allows temporary violations of the foreign key constraint within a transaction, which can be helpful for inserting related records in any order.

Django’s ORM is designed to work with deferrable constraints:

  • It can help prevent issues when saving related objects, especially in complex transactions
  • Some Django features (like bulk_create with related objects) work better with deferrable constraints

No Downside for Most Applications:

  • Deferrable constraints still ensure data integrity by the end of each transaction
  • The performance impact is typically negligible
  • If a constraint must be checked immediately, you can still enforce it at the application level

So I keep the Django-generated foreign key constraints and consider the following two to be equivalent:

FOREIGN KEY (manufacturer) REFERENCES organizations(id)

FOREIGN KEY (manufacturer) REFERENCES organizations(id) DEFERRABLE INITIALLY DEFERRED
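
To see concretely what the deferred check permits, here is a minimal sketch (not from the original post; Car and Organization are hypothetical models, with Car.manufacturer a foreign key to Organization and a UUID primary key assumed):

import uuid

from django.db import transaction

from myapp.models import Car, Organization  # hypothetical app and models

with transaction.atomic():
    org_id = uuid.uuid4()

    # Insert the referencing row first; with DEFERRABLE INITIALLY DEFERRED the
    # foreign key check only runs when the transaction commits.
    Car.objects.create(name="roadster", manufacturer_id=org_id)

    # Insert the referenced row afterwards, still inside the same transaction.
    Organization.objects.create(id=org_id, name="acme")

# At commit time both rows exist, so the constraint is satisfied.
# With a non-deferrable constraint, the first insert would fail immediately.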


To my surprise, when you search for gleam read file on Google, there is not much helpful information on the first page and no code example.

There is a post on the Erlang Forums where the author of the Gleam language points to a module that no longer exists in the gleam_erlang package, an abandoned package called gleam_file, and a couple of packages like simplifile.

It turns out that Gleam has excellent FFI: if you are running on the BEAM (the default target unless you compile Gleam to JavaScript), the simple case only requires importing the function from Erlang, in just two lines of code.

@external(erlang, "file", "read_file")
fn read_file(file_path: String) -> Result(String, Nil)

and you can use it as a normal function to read the file content into a string.

import gleam/result

pub fn read_file_as_string(path: String) {
  use content <- result.try(
    read_file(path)
    |> result.map_error(fn(_) { "Failed to read file: " <> path }),
  )
  Ok(content)
}


Django gRPC workflow (2025-02-19)

Django-socio-grpc (DSG) is a framework for using gRPC with Django. It builds upon django-rest-framework (DRF), making it easy for those familiar with DRF to get started with DSG.

Although I decided to go back to DRF after exploring DSG, I did so because I needed to get things done quickly. Using gRPC is considered a potential way to achieve performance gains, but there are some obstacles that need to be addressed before going full gRPC. I'm leaving these notes as a record of my learning experience.

The workflow

With django-socio-grpc (DSG), the workflow looks like the following:

flowchart TD
  subgraph backend[Backend Side]
	  db-schema[Design Database Schema] --> |create| django-models[Django Models] --> |define| serializers
	  pb-backend[Protocol Buffers]	  
	  django-models --> server-app[Server Application]
	  django-models --> |migrate| database
  end

  subgraph protobuf-repo[Protobuf repository]
	  .proto[.proto] --> |run| buf-lint[buf lint] --> |detect| buf-check[Breaking Change] --> |if pass|protobuf-gen[Generate Code]
	  protobuf-gen --> server-stub[gRPC Server skeleton]
	  protobuf-gen --> client-stub[gRPC Client Stub]
	  protobuf-gen --> mypy-types[.pyi type stub]
  end

  subgraph frontend[Frontend Side]
	  pb-frontend[Protocol Buffers]
		client-app[Client Application] --> |call| client-stub
  end

  serializers --> |DSG generates| .proto

  server-app --> |implement| server-stub
  server-stub --> |serializes| pb-backend

  mypy-types --> server-app

  client-stub --> |serializes| pb-frontend
  pb-backend <--> |binary format over HTTP/2| pb-frontend

Make DSG aware of big integer model fields

Currently DSG (version 0.24.3) incorrectly maps some model fields to the int32 type in protobuf due to DRF's field mapping; they should be mapped to the int64 type.

It's kind of hard to fix at the library level, so at the application level I implemented something like the following. Using BigIntAwareModelProtoSerializer as the parent class of the proto serializer correctly maps BigAutoField, BigIntegerField, and PositiveBigIntegerField to the int64 type in protobuf.

from django.db import models
from django_socio_grpc import proto_serializers
from rest_framework import serializers

class BigIntegerField(serializers.IntegerField):
    """Indicate that this filed should be converted to int64 on gRPC message.

    This should apply to
    - models.BigAutoField.
    - models.BigIntegerField
    - models.PositiveBigIntegerField

    rest_framework.serializers.ModelSerializer.serializer_field_mapping
    maps django.db.models.BigIntegerField to rest_framework.serializers.IntegerField.

    Although the value bounds are set correctly, django-socio-grpc can only map that to int32,
    so we need to explicitly mark the field for django-socio-grpc to convert it to int64.
    """

    proto_type = "int64"

class BigIntAwareModelProtoSerializer(proto_serializers.ModelProtoSerializer):
    """A ModelProtoSerializer that automatically converts Django BigInteger fields to gRPC int64 fields by modifying the field mapping."""

    @classmethod
    def update_field_mapping(cls):
        # Create a new mapping dictionary inheriting from the base
        field_mapping = dict(getattr(cls, "serializer_field_mapping", {}))

        # Update the mapping for BigInteger fields
        field_mapping.update(
            {
                models.BigIntegerField: BigIntegerField,
                models.BigAutoField: BigIntegerField,
                models.PositiveBigIntegerField: BigIntegerField,
            }
        )

        # Set the modified mapping
        cls.serializer_field_mapping = field_mapping

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        cls.update_field_mapping()

Major obstacles for using gRPC

The biggest obstacle is that browsers do not natively support gRPC, which relies on HTTP/2. As a result, frontend calls from a browser to backend gRPC services require a proxy, typically Envoy. This setup involves additional overhead, such as configuring a dedicated API gateway or setting up an ingress. Even with a service mesh like Istio, some extra work is still necessary.

The next challenge is how to interoperate with existing RESTful services if we choose to add a gRPC service. For communication between a RESTful service and a gRPC service, a gRPC-JSON transcoder (for example, Envoy) is needed so that HTTP/JSON can be converted to gRPC. Again, some extra work is needed at the infrastructure level.

The last issue is that data is transferred in binary form (which is the whole point of using gRPC for performance), and that makes debugging a little bit harder.

Conclusion

Django-socio-grpc is solid and its documentation is good. However, the major issue is the overhead work that comes with using gRPC. I will consider it again when I need extra performance and my team’s tech stack is adapted to gRPC.


Gleam language: how to find the min element in a list

The Gleam standard library has list.max to find the maximum element in a list, but to my surprise it doesn't provide a counterpart list.min function. To find the minimum, you have to pass a compare function negated with order.negate:

import gleam/int
import gleam/io
import gleam/list
import gleam/order

pub fn main() {
  let numbers = [5, 3, 8, 1, 9, 2]
  
  // Find the minimum value using list.max with order.negate
  let minimum = list.max(numbers, with: fn(a, b) {
    order.negate(int.compare(a, b))
  })
  
  // Print the result (will be Ok(1))
  io.debug(minimum)
}

Another noteworthy aspect is that when a list contains multiple maximum values, list.max returns the last occurrence of the maximum value. This contrasts with Python's built-in max(), which returns the first occurrence in such cases. I noticed the discrepancy while comparing implementations of the same algorithm in both languages.

  partitions
  |> list.max(fn(a, b) {
    float.compare(a |> expected_entropy, b |> expected_entropy)
    |> order.negate
  })

In the code snippet provided, the result will be the last element in the list that has the minimal entropy value.
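
For comparison, a quick Python check of the tie-breaking behaviour of the built-in max():

pairs = [("a", 3), ("b", 1), ("c", 3)]

# Python's max() returns the first maximal element when there are ties,
# unlike Gleam's list.max, which returns the last one.
print(max(pairs, key=lambda p: p[1]))  # ('a', 3)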


This git principle (merge into main, rebase feature branches) advocates for a workflow that balances a clear main branch history with efficient feature development.

1. Merge into main:

  • Purpose: Keeps the main branch history clean and linear in terms of releases and major integrations.
  • How it works: When a feature is complete and tested, it’s integrated into main using a merge commit. This explicitly marks the point in time when the feature was incorporated.
  • Benefit: main branch history clearly shows the progression of releases and key integrations, making it easier to track releases and understand project evolution.

2. Rebase feature branches:

  • Purpose: Maintains a clean and linear history within each feature branch and simplifies integration with main.
  • How it works: Before merging a feature branch into main, you rebase it onto the latest main. This replays your feature branch commits on top of the current main, effectively rewriting the feature branch history.
  • Benefit:
    • Linear History: Feature branch history becomes a straight line, easier to understand and review.
    • Clean Merges: Merging a rebased feature branch into main often results in a fast-forward merge (if main hasn’t advanced since the rebase), or a simpler merge commit, as the feature branch is already based on the latest main.
    • Avoids Merge Bubbles: Prevents complex merge histories on feature branches that can arise from frequently merging main into the feature branch.

In essence:

  • main branch: Preserve a clean, chronological, and release-oriented history using merges.
  • Feature branches: Keep them clean and up-to-date with main using rebase to simplify integration and maintain a linear development path within the feature.

Analogy: Imagine main as a clean timeline of major project milestones. Feature branches are like side notes. Rebase neatly integrates those side notes onto the main timeline before officially adding them to the main history via a merge.

Original code from @XorDev
vec2 p=(FC.xy-r*.5)/r.y*mat2(8,-6,6,8),v;for(float i,f=3.+snoise2D(p+vec2(t*7.,0));i++<50.;o+=(cos(sin(i)*vec4(1,2,3,1))+1.)*exp(sin(i*i+t))/length(max(v,vec2(v.x*f*.02,v.y))))v=p+cos(i*i+(t+p.x*.1)*.03+i*vec2(11,9))*5.;o=tanh(pow(o/1e2,vec4(1.5)));


TIL Python uses the integer itself as the hash value, except for -1: the hash value of -1 is -2.

# For ordinary integers, the hash value is simply the integer itself (unless it's -1).
class int:
    def __hash__(self):
        value = self
        if value == -1:
            value = -2
        return value

source
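
A quick way to check this behaviour from the interpreter (these values hold for CPython, where -1 is reserved as an error indicator at the C level):

assert hash(5) == 5
assert hash(-1) == -2
assert hash(-2) == -2  # -1 and -2 therefore share the same hash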


Auto-venv is a Fish shell script that automatically activates and deactivates Python virtual environments when entering/leaving a directory that contains a virtual environment.

Recently, I added multiple enhancements compared to the upstream version, so it now handles edge cases more gracefully:

  • It safely manages virtual environment inheritance in new shell sessions.
  • It prevents shell exits during the activation and deactivation processes.


A raycast script command to start lowfi

I recently created a Raycast script command to start the lowfi command effortlessly, so I can enjoy lowfi music in no time and control it using just the keyboard.

Here is the gist; just place lowfi.sh into the Raycast Script Directory. Run it the first time and a WezTerm window will be created if the lowfi process isn't running; run it a second time and the WezTerm window will be brought to the front.

Problem with using /opt/homebrew/bin/wezterm

While I thought it was a simple task, it took me an hour to finish. I encountered a totally unexpected problem: osascript can't control the WezTerm window that is running the lowfi process.

I installed WezTerm using the Homebrew cask. When launching WezTerm via the Homebrew-installed binary (/opt/homebrew/bin/wezterm), the window was created with a NULL bundleID, which made it impossible for AppleScript/System Events to properly control it. This is because the Homebrew version doesn't properly register itself with macOS's window management system.

I was only able to debug this problem thanks to the amazing aerospace command

aerospace debug-windows

The solution was to always use the full application path /Applications/WezTerm.app/Contents/MacOS/wezterm for all WezTerm operations, ensuring proper window management integration with macOS.

When launching WezTerm using the full application path (/Applications/WezTerm.app/Contents/MacOS/wezterm), the window is created with the proper bundleID com.github.wez.wezterm. This allows:

  1. System Events to properly identify and control the window
  2. AppleScript to manipulate the window through accessibility actions (AXRaise, AXMain)
  3. Proper window focusing and bringing to front