Personal blog of Miguel Araujo

Django Forms I: Custom fields and widgets in detail

On Thursday, September 8th, I co-presented my first talk, "Advanced Django Form Usage", with Daniel Greenfeld at DjangoCon.us 2011. It was a first for me in many ways: I had never spoken in front of such a large audience before, let alone in English, and it was also my first Python talk. If you haven’t seen the video of the talk and the slides, you probably want to, because this post tries to extend, clarify and recap them. It also serves as my official erratum for things in the slides that were not exactly true or right.

Some people really loved the talk and came up to me afterwards to comment on it and ask some questions. By that time, I was already thinking that the topic deserved its own series of posts. Gabriel Hurley, Django committer and documentation specialist, stated:

Apparently some core developers were asking to use the code examples and a transcript from the talk as a starting point for this task. Well, I wasn’t sure the transcript would be good enough, since many things were left unsaid, but I agreed with Gabriel that it’s a good starting point to base docs on, which is better than starting from scratch. I still feel the need to write this series of posts, which will hopefully be a better starting point than the transcript. I’ve been extremely busy since I gave the talk, but now I happen to have some spare time, so let’s start.

This first post will start with some advanced things: creating custom form fields, custom widgets and how to get the most out of them. Beware that you need to have a good grasp of Django forms to grok these concepts.

Built-in fields and widgets

Here is a list of available built-in Django form fields and Django widgets. I will cover in the next sections how to create your custom ones, but you should be familiar with these lists, so that you don’t end up reinventing the wheel.

Creating custom form fields

In the different Django projects I have come across in my work, I haven’t seen many that implement custom form fields. This is probably because the docs don’t explain this in detail. The Django docs have a section called Creating custom fields that claims:

> If the built-in Field classes don’t meet your needs, you can easily create custom Field classes. To do this, just create a subclass of django.forms.Field. Its only requirements are that it implement a clean() method and that its __init__() method accept the core arguments mentioned above (required, label, initial, widget, help_text).

Well, the truth is that you don’t need to implement a clean() method to create your custom field and the core arguments at the moment are: required, widget, label, initial, help_text, error_messages, show_hidden_initial, validators and localize.

But before writing your first form field, you should fully understand form validation, as it plays an important role in how you write your own fields. The docs have a well-documented section on this, Form and field validation, that you should read before going any further. However, the first time I came to Django and read it, even though I reread it several times, I didn’t quite get the big picture. Not sure why, but I had the feeling that some pieces of the puzzle were missing. Thus, I created this diagram for the talk:

I wanted to show the order of execution of the different methods and add the role that widgets play in this schema. Widgets are part of forms; however, they are only mentioned once in that section of the docs:

> to_python() […] This method accepts the raw value from the widget and returns the converted value.

I honestly think saying “raw value” here is not completely right, because some widgets (MultipleHiddenInput, CheckboxInput, NullBooleanSelect, SelectMultiple, CheckboxSelectMultiple, MultiWidget, SplitDateTimeWidget and SplitHiddenDateTimeWidget) don’t return the raw value. They process it and return Python lists or booleans.

Every widget has to have a method named value_from_datadict(data, files, name). data is in fact request.GET/request.POST, files is request.FILES and name is the name of the field to which the widget is attached. This method usually grabs what it needs from data and files, using name as a key, and returns a Python value.
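To make this more tangible, here is a minimal sketch written without Django. The three classes are hypothetical stand-ins, not Django's real widgets (which handle more cases), and a plain dict plays the role of request.POST. The name_0, name_1 key convention is the one MultiWidget uses for its sub-widgets.

```python
# Simplified, Django-free stand-ins for widgets. Assumption: a plain dict
# is used in place of request.POST, and files is ignored.


class TextInputSketch(object):
    def value_from_datadict(self, data, files, name):
        # A plain input hands back whatever was submitted under its name.
        return data.get(name)


class CheckboxSketch(object):
    def value_from_datadict(self, data, files, name):
        # Browsers omit unchecked checkboxes, so a missing key means False.
        return name in data


class TwoPartSketch(object):
    def value_from_datadict(self, data, files, name):
        # A composed widget collects several submitted keys into one list.
        return [data.get('%s_%s' % (name, i)) for i in range(2)]


post = {
    'title': 'Hello',
    'agree': 'on',
    'when_0': '2011-09-08',
    'when_1': '10:30',
}

title = TextInputSketch().value_from_datadict(post, {}, 'title')
agree = CheckboxSketch().value_from_datadict(post, {}, 'agree')
when = TwoPartSketch().value_from_datadict(post, {}, 'when')
```

Here title is a string, agree is a boolean and when is a list, which is exactly why the value that reaches to_python depends on the widget.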

You’ve probably wondered before what type of value clean(value) or to_python(value) gets. Well, the answer is that it depends on the widget. It can get a string, a list or any Python object. Widgets are the first part of building a custom field, so you need to have this in mind.

The advanced reader is probably thinking: if the widget can do this, what is to_python supposed to do? Remember that to_python converts a “raw value” into a Python value. Well, the borders are sometimes thin. I would recommend doing your main conversion in to_python and coding your widgets to be as data agnostic as possible. As a rule of thumb, you shouldn’t use any non-standard Python datatype in a widget.

Subclassing Field

Subclassing Field has some implications that you should be aware of. Field has the following class-level variables that you should know about:

  • widget: Default widget to use for rendering the field, defaults to TextInput.
  • hidden_widget: Default widget to use when rendering this as “hidden”, defaults to HiddenInput.
  • default_validators: Default list of validators to be run, defaults to an empty list.
  • default_error_messages: Default error messages. Defaults to {'required': _(u'This field is required.'), 'invalid': _(u'Enter a valid value.')}

As you probably imagine, class-level variables are used to define defaults in a field.

First custom field: AMarkField

Your custom fields should live in a file named fields.py. Let’s create a form field that only accepts an uppercase “A”. We will be subclassing forms.Field, which gets most of the dirty work done for us.

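from django import forms
from django.core import validators
from django.core.exceptions import ValidationError
from django.utils.translation import ugettext_lazy as _

class AMarkField(forms.Field):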
    default_error_messages = {
        'not_an_a': _(u'you can only input A here! damn!'),
    }

    def to_python(self, value):
        if value in validators.EMPTY_VALUES:
            return None

        if value != 'A':
            raise ValidationError(self.error_messages['not_an_a'])

        return value

If you look carefully, we are defining the class-level variable default_error_messages. This does not wipe out the required and invalid messages: the constructor of forms.Field collects default_error_messages from every class in the hierarchy, so our not_an_a message is added to the ones inherited from Field. We are not handling extra parameters in our AMarkField, so there is no need to override the constructor. As we didn’t specify anything different, we are using TextInput as the widget by default, so we get a string as value. If the string is not an uppercase “A”, we raise a ValidationError using self.error_messages['not_an_a']. self.error_messages is a dictionary that merges those class-level defaults with error_messages, an optional dictionary passed to the field that overrides the default messages it will raise. An example would make this clearer:

class ExampleForm(forms.Form):
    mark = AMarkField(error_messages = {'not_an_a': "Only A please"})

This is a basic form that uses AMarkField and passes error_messages to override the message for not_an_a. That’s why we use self.error_messages: to display overridden messages if there are any. This dictionary is created in the constructor of forms.Field, so you don’t have to worry about that. If you happen to override the constructor, don’t forget to call the parent constructor too.
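The merging of messages can be sketched without Django. The classes below are hypothetical stand-ins that only imitate how the forms.Field constructor assembles self.error_messages. Treat it as an illustration of the idea rather than Django's actual code.

```python
class FieldSketch(object):
    # Stand-in for forms.Field, reduced to error message handling only.
    default_error_messages = {
        'required': 'This field is required.',
        'invalid': 'Enter a valid value.',
    }

    def __init__(self, error_messages=None):
        messages = {}
        # Walk from the most generic class to the most specific one,
        # so subclasses can add or replace individual keys.
        for klass in reversed(type(self).__mro__):
            messages.update(getattr(klass, 'default_error_messages', {}))
        # Per-instance overrides win over every class-level default.
        messages.update(error_messages or {})
        self.error_messages = messages


class AMarkFieldSketch(FieldSketch):
    default_error_messages = {
        'not_an_a': 'you can only input A here! damn!',
    }


plain = AMarkFieldSketch()
custom = AMarkFieldSketch(error_messages={'not_an_a': 'Only A please'})
```

plain.error_messages contains required, invalid and not_an_a, while custom.error_messages carries the overridden not_an_a text.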

Some picky readers will have noticed that I’m doing validation in to_python. The only time you should raise a ValidationError in to_python is when you are coercing the value and an exception occurs. So the right way to do AMarkField would be:

class AMarkField(forms.Field):
    default_error_messages = {
        'not_an_a': _(u'you can only input A here! damn!'),
    }

    def to_python(self, value):
        if value in validators.EMPTY_VALUES:
            return None
        return value

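    def validate(self, value):
        # Keep the built-in "required" check, then skip empty optional values
        super(AMarkField, self).validate(value)
        if value not in validators.EMPTY_VALUES and value != 'A':
            raise ValidationError(self.error_messages['not_an_a'])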

Note that EMPTY_VALUES is basically the tuple (None, '', [], (), {}).
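The order in which a field runs these methods can be sketched in plain Python. MiniField and ValidationErrorSketch are hypothetical names, not Django classes. The clean() body mirrors the to_python, validate, run_validators sequence described in the validation docs, and the calls list exists only to make that order visible.

```python
EMPTY_VALUES = (None, '', [], (), {})


class ValidationErrorSketch(Exception):
    """Stand-in for django.core.exceptions.ValidationError."""


class MiniField(object):
    def __init__(self, required=True):
        self.required = required
        self.calls = []  # records the order of execution, for illustration

    def to_python(self, value):
        self.calls.append('to_python')
        return value

    def validate(self, value):
        self.calls.append('validate')
        if value in EMPTY_VALUES and self.required:
            raise ValidationErrorSketch('This field is required.')

    def run_validators(self, value):
        self.calls.append('run_validators')

    def clean(self, value):
        # Coerce first, then validate, then run the reusable validators.
        value = self.to_python(value)
        self.validate(value)
        self.run_validators(value)
        return value


class AMarkSketch(MiniField):
    def to_python(self, value):
        value = super(AMarkSketch, self).to_python(value)
        return None if value in EMPTY_VALUES else value

    def validate(self, value):
        super(AMarkSketch, self).validate(value)
        if value not in EMPTY_VALUES and value != 'A':
            raise ValidationErrorSketch('you can only input A here! damn!')


field = AMarkSketch()
cleaned = field.clean('A')
optional = AMarkSketch(required=False).clean('')
```

Calling AMarkSketch().clean('B') raises ValidationErrorSketch, while an empty value on an optional field comes back as None.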

What a custom field means

In a custom field you can wrap validation and Python coercion of any kind. This is the best practice if you are handling the same unit of information in several forms. What you shouldn’t be doing is this:

class MyModelForm(forms.ModelForm):
    mark = CharField()

    def clean_mark(self):
        mark_value = self.cleaned_data['mark']
        if mark_value is not None and mark_value != 'A':
            raise ValidationError(_(u'you can only input A here! damn!'))

        return mark_value

class ExampleForm(forms.Form):
    mark = CharField()

    def clean_mark(self):
        mark_value = self.cleaned_data['mark']
        if mark_value is not None and mark_value != 'A':
            raise ValidationError(_(u'you can only input A here! damn!'))

        return mark_value

clean_<fieldname>() should be used for validation of a field in the specific context of that form. In the example above we are repeating validation logic, violating the famous Django DRY principle. For sure there are ways to make this look more DRY, like wrapping that logic in a function that gets called. But you will agree with me that these are mere hacks.

If you construct your fields the right way, you will rarely find a use case in which you need to write a clean_<fieldname>() method. Normally it is used to avoid creating a form field for validation that is not trivially expressed using validators.

Another example: JSONField

Imagine we want a form field that validates JSON input and returns the decoded list. Here it is:

import json  # stdlib module, same loads() API as simplejson

class JSONField(forms.Field):
    default_error_messages = {
        'invalid': 'This is not a valid JSON string'
    }

    def to_python(self, value):
        if value in validators.EMPTY_VALUES:
            return []

        try:
            # json.loads raises ValueError on malformed input
            decoded = json.loads(value)
        except ValueError:
            raise ValidationError(self.error_messages['invalid'])

        return decoded

This time we are raising a ValidationError in the right method, because we are handling an exception while doing the coercion of the value.
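The coercion step on its own can be tried outside Django. The function below is a hypothetical stand-alone version of that to_python logic, using the standard library json module and a plain ValueError in place of ValidationError.

```python
import json

EMPTY_VALUES = (None, '', [], (), {})


def json_to_python(value):
    # Empty input becomes an empty list, as in the field above.
    if value in EMPTY_VALUES:
        return []
    try:
        return json.loads(value)
    except ValueError:
        # Stand-in for raising ValidationError with the 'invalid' message.
        raise ValueError('This is not a valid JSON string')


empty = json_to_python('')
numbers = json_to_python('[1, 2, 3]')
mapping = json_to_python('{"a": 1}')
```

Note that any valid JSON document is accepted, so mapping ends up as a dict. If the field must really return a list, an extra isinstance check in validate() would be one way to enforce it.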

Adding widgets to the equation

Just as we have custom fields, we can have custom widgets, and both concepts together turn Django forms into a powerful weapon. Your custom widgets should live in a widgets.py file in one of the apps that compose your project.

In widgets you will usually subclass Widget, Input or MultiWidget. Widget is the baseclass for the rest of the built-in widgets available in Django. Therefore Input and MultiWidget are already subclasses of Widget.

Subclassing Widget

You will rarely find yourself in a situation in which you need to subclass Widget itself. There are other baseclasses that inherit from it that will do the job.

Anyway, what you need to know when you subclass Widget is that you will have to override the render(name, value, attrs=None) method, which is in charge of returning a string that represents the widget rendered as HTML. Also, if you override the constructor in your subclass, you should handle the attrs parameter and pass it to the parent.

The class-level attributes of Widget that you should know about are:

  • is_hidden = False: Determines whether this corresponds to an <input type="hidden">.
  • needs_multipart_form = False: Determines if the widget needs a multipart-encoded form.
  • is_localized = False: Determines if the widget is localized.
  • is_required = False: Determines if the widget is required.

Subclassing Input

If you think about it carefully, most widgets should render an <input />, a textarea or similar things. If you need to render a customized <input />, you should know that Input is already a subclass of Widget, so you get more of the dirty work done. Input defines a class-level variable named input_type that is used for specifying the type of the input: <input type="BLABLA" />.

When subclassing Input we already get a render method that will suffice most of the time. Sometimes you will want to add some logic before calling the parent class render method. The next example is quite basic, but you will probably get the idea of how to subclass Input:

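from django.forms.widgets import Input

class HTML5Input(Input):
    def __init__(self, type, attrs=None):
        # type overrides the class-level input_type for this instance
        self.input_type = type
        super(HTML5Input, self).__init__(attrs)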

What we are doing here is adding a parameter type that will overwrite class-level variable input_type. This way I’m creating some sort of universal configurable input, therefore now we can use this and do things like:

class ExampleForm(forms.Form):
    user_email = forms.EmailField(
        widget=HTML5Input(type='email', attrs={'class': "emailFields"})
    ) 

This way we will get a fancy HTML5 <input type="email" class="emailFields" /> for our EmailField. Also note that I’m passing an attrs dictionary, which is used for rendering the widget. We will see more on this later, but the dictionary key class turns into the class of the input. This is a typical beginner’s question too: how do you set a custom CSS class for your widget?
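How an attrs dictionary ends up as HTML attributes can be illustrated with a few lines of plain Python. Both functions are hypothetical helpers written for this post, not Django's own rendering code, and they sort the keys only to make the output predictable.

```python
from html import escape


def flatten_attrs(attrs):
    # Turn {'class': 'x'} into ' class="x"', escaping the values.
    return ''.join(
        ' %s="%s"' % (key, escape(str(value), quote=True))
        for key, value in sorted(attrs.items())
    )


def render_input(input_type, name, attrs=None):
    # Merge the caller's attrs with the type and name of the input.
    final_attrs = dict(attrs or {}, type=input_type, name=name)
    return '<input%s />' % flatten_attrs(final_attrs)


html = render_input('email', 'user_email', {'class': 'emailFields'})
```

html is then <input class="emailFields" name="user_email" type="email" />, so every key in attrs simply becomes an attribute of the tag.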

Of course we could have created our own specific EmailInput widget doing:

class EmailInput(Input):
    input_type = 'email'

Anyway, if you are looking for a set of HTML5 widgets that rock, you should check django-floppyforms, a Django app by Bruno Renié.

Subclassing MultiWidget

MultiWidget is a baseclass for creating composed widgets. The first thing you need to understand is that there is not a 1:1 relation between fields and widgets. You can have a form field that uses a MultiWidget subclass as its widget. Hopefully you will fully understand what this means by the end of the next example.

Imagine you create an AddressField and you want that field to get the values of several inputs in your form. You need to create a widget that renders several inputs, for example: street, number and zipcode. This is done by subclassing MultiWidget.

MultiWidget is constructed using a tuple/list of other widgets. Imagine we call the widget for our field AddressWidget. AddressWidget could be built using (TextInput, TextInput, TextInput) or (StreetInput, TextInput, ZipCodeInput), if you happen to have those custom widgets StreetInput and ZipCodeInput in your app.

When subclassing MultiWidget, there are two main things you need to do:

  • Create a constructor that passes a widgets parameter to the parent class constructor. You can name your variable whatever you want (I use widgets in my example), but remember it needs to be a tuple or a list.
  • Implement the decompress(value) method.

What’s decompress for?

Every widget in the list of widgets used for building AddressWidget expects a value. Those values are obtained from a list that maps 1:1 to the list of widgets. The decompress(value) method is called when value is not already a list (typically a single string), so it should return a list that maps values to widgets. Why would this happen? Well, when you create an unbound form instance using initial or instance.

form = ExampleForm(initial={'address': "Imaginary Road,2,2039"})

This means we have a form called ExampleForm with a form field named address and, supposing it’s using AddressWidget, the widget gets the value "Imaginary Road,2,2039". Suppose we are using commas to separate the parts in the string. Then decompress should parse this string and return ['Imaginary Road', '2', '2039']. The MultiWidget baseclass code will handle the assignment of those values to the different widgets:

  • StreetInput
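As a quick illustration of that mapping, here is decompress as a stand-alone function in plain Python. The comma-separated format is the assumption made in this example, and widget_names is just a list of labels used to show the pairing, not real widget instances.

```python
def decompress(value):
    # Split the stored string into one value per sub-widget.
    if value:
        return value.split(',')
    # No value at all: every sub-widget gets None.
    return [None, None, None]


widget_names = ['StreetInput', 'TextInput', 'ZipCodeInput']

# Values are matched to widgets by position.
pairs = list(zip(widget_names, decompress('Imaginary Road,2,2039')))
```

pairs matches 'Imaginary Road' with StreetInput, '2' with TextInput and '2039' with ZipCodeInput. A street name that itself contains a comma would break this simple format, so a real project would probably want a safer separator.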

Handling render in MultiWidget

In the previous output you probably imagined something like this, right:

Well, for that we currently have three options. I recommend you read the three of them in order:

1. The hard one but customizable

You will have to override the render method. Forget about using the parent class render, because it returns a string and that string is generated using inline Python. That would mean calling the parent render and injecting our labels into the string. Trust me, that is not worth the time and effort. I’m showing the code of a render method based on MultiWidget.render that adds some HTML labels. This is the code block I crafted for the DjangoCon.us presentation. You should not do it this way unless you are sure about what you are doing.

Please don’t panic, I’ll explain it in detail right away and I will show you a couple easier ways to do this :)

from django.utils.safestring import mark_safe

def render(self, name, value, attrs=None):
    # HTML to be added to the output
    widget_labels = [
        '<label for="id_%s">Address: </label>',
        '<label for="id_%s">Number: </label>',
        '<label for="id_%s">ZipCode: </label>'
    ]

    if self.is_localized:
        for widget in self.widgets:
            widget.is_localized = self.is_localized

    # value is a list of values, each corresponding to a widget in self.widgets
    if not isinstance(value, list):
        value = self.decompress(value)

    output = []
    final_attrs = self.build_attrs(attrs)
    id_ = final_attrs.get('id', None)
    for i, widget in enumerate(self.widgets):
        try:
            widget_value = value[i]
        except IndexError:
            widget_value = None
        if id_:
            final_attrs = dict(final_attrs, id='%s_%s' % (id_, i))

        output.append(widget_labels[i] % ('%s_%s' % (name, i)))
        output.append(widget.render(name + '_%s' % i, widget_value, final_attrs))

    return mark_safe(self.format_output(output))

I have added a widget_labels list that contains the labels we want to add in front of each input. The next three lines set the is_localized attribute of the widgets that compose the AddressWidget if AddressWidget.is_localized is set. If value is not a list, we decompress it to get a list. build_attrs is a Widget method that builds a dictionary out of two: the local attrs and self.attrs (the one passed when instantiating the widget). Using that, I iterate over self.widgets, picking their corresponding value and adding to output pairs of labels and the HTML rendered by each widget. After iterating over all the widgets, I return the output marked as safe and formatted. I will explain what format_output is for later.

Easy? Well, probably not; it depends on how familiar you are with Python/Django. Sure, this is tedious, not DRY and not elegant. This is probably not what you expected from Django, right? Well, that’s because this is not the right way to do it. The reason I’ve walked through it anyway is that I needed to make this very clear.

2. Easier but more rigid

If you looked at the previous code, you saw that MultiWidget has a method named format_output that is called when returning output. output is a list of the rendered widgets’ HTML as strings. So you could override format_output and inject your labels into that list. The problem is that those labels have to be quite fixed, or you need to do some parsing of the corresponding rendered widgets to extract ids and so on.

def format_output(self, rendered_widgets):
    """
    Given a list of rendered widgets (as strings), returns a Unicode string
    representing the HTML for the whole lot.

    This hook allows you to format the HTML design of the widgets, if
    needed.
    """
    rendered_widgets.insert(0, '<label for="id_address_field_0">Street:</label>')
    rendered_widgets.insert(2, '<label for="id_address_field_1">Number:</label>')
    rendered_widgets.insert(4, '<label for="id_address_field_2">Zip Code:</label>')
    return u''.join(rendered_widgets)

I think the code is easy to follow. You probably agree with me, this is the right way to do it even though you might have to add some parsing to insert customizable strings in rendered_widgets.
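If the hard-coded insert positions feel fragile, the same interleaving can be written as a loop. This is a Django-free sketch: format_output_sketch is a hypothetical helper, and the default id prefix id_address_field is taken from the example above and assumes the form field is named address_field.

```python
def format_output_sketch(rendered_widgets, labels, field_id='id_address_field'):
    pieces = []
    # Pair each label with the widget rendered at the same position.
    for index, (label, widget_html) in enumerate(zip(labels, rendered_widgets)):
        pieces.append('<label for="%s_%s">%s:</label>' % (field_id, index, label))
        pieces.append(widget_html)
    return ''.join(pieces)


rendered = [
    '<input name="address_field_0" />',
    '<input name="address_field_1" />',
    '<input name="address_field_2" />',
]
html = format_output_sketch(rendered, ['Street', 'Number', 'Zip Code'])
```

Unlike the in-place insert calls, this version does not modify the list it receives, and adding a fourth widget only requires a fourth label.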

3. Third one, using MultiWidgetLayout

As you may know, I’m the lead developer of django-uni-form, so I decided to borrow some of its concepts, creating an alternative to MultiWidget, named MultiWidgetLayout.

The difference is that you can control the rendering of your widget using a simple layout, which is a list of strings and widgets, so you don’t have to specify the widgets separately. Using it, the AddressWidget code turns into this:

from django.forms import TextInput
# MultiWidgetLayout itself comes from django-multiwidgetlayout

class AddressWidget(MultiWidgetLayout):
    def __init__(self, attrs=None):
        layout = [ 
            '<label for="%(id)s">Street:</label>', TextInput(),
            '<label for="%(id)s">Number:</label>', TextInput(),
            '<label for="%(id)s">Zip Code:</label>', TextInput()
        ]
        super(AddressWidget, self).__init__(layout, attrs)

    def decompress(self, value):
        if value:
            return value.split(",")
        return [None, None, None]

This will get you the exact same results. Where can you get MultiWidgetLayout? Well, I’ve created a ticket in Django (#16959) to see if the concept is accepted into Django itself, meanwhile you can grab the code from django-multiwidgetlayout.

Custom fields and widgets both together

If you didn’t get lost along the way, you may want to know how the two play together. Remember the diagram I showed you way above? Widgets return Python values, and fields have a to_python method that can turn those into different ones. Want to see an example of what this means?

1. AddressField with AddressWidget

This time we combine a custom field and a custom widget, and this is what happens:

fields.py

class AddressField(forms.Field):
    widget = AddressWidget

    def to_python(self, value):
        # Already gets a Python list
        return value

forms.py

from django import forms
from .fields import AddressField

class ExampleForm(forms.Form):
    address = AddressField()

Our cleaned_data for this field would be a Python list: ["SW Sixth Avenue", "921", "Portland"]

2. AddressWidget with a built-in field

If, for example, we use the built-in field CharField with AddressWidget:

forms.py

class ExampleForm(forms.Form):
    address = forms.CharField(widget=AddressWidget)

Our cleaned_data for this field would be a Python string: "['SW Sixth Avenue', '921', 'Portland']". Note that this is a string that represents a list, but it is still a string.
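The difference is easy to reproduce in a Python shell without Django. The snippet below only imitates the effect of a text field converting whatever the widget returns into a string. The ast.literal_eval call is shown as a curiosity, not as a recommendation.

```python
import ast

# What AddressWidget hands over: a real list.
widget_value = ['SW Sixth Avenue', '921', 'Portland']

# What a text field makes of it: the textual representation of that list.
as_text = str(widget_value)

# Getting the list back requires parsing the string again.
recovered = ast.literal_eval(as_text)
```

as_text[0] is the character '[', not the street name, which is why pairing a list-returning widget with a plain CharField is rarely what you want.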

A Field for MultiWidget: MultiValueField

Django developers thought that probably you would like to have the same concept of MultiWidget for fields, creating the baseclass MultiValueField. MultiValueField is a field composed of several form fields. So now imagine you have your AddressWidget that returns the previous list ["SW Sixth Avenue", "921", "Portland"] and you want each value of the list to be validated using fields you have available. It’s easier than you may think, you would do:

class AddressField(forms.MultiValueField):
    widget = AddressWidget

    def __init__(self, *args, **kwargs):
        fields = (forms.CharField(), forms.IntegerField(), forms.CharField())
        super(AddressField, self).__init__(fields, *args, **kwargs)

    def compress(self, data_list):
        return data_list

This means that "SW Sixth Avenue" would be validated as a CharField, "921" as an IntegerField, and so forth. But maybe you have bigger plans for this. You could create a StreetField that geolocated the data during cleaning, raising errors when something goes wrong and returning an instance of a Spot class otherwise. Now you have plenty of possibilities and aces up your sleeve.

One thing you have to do when subclassing MultiValueField is implement compress. This method is called before returning cleaned_data in clean(). The comments in this method say:

> Returns a single value for the given list of values. The values can be assumed to be valid.
>
> For example, if this MultiValueField was instantiated with fields=(DateField(), TimeField()), this might return a datetime object created by combining the date and time in data_list.

Well, this is not completely true in my humble opinion. You might want to return the list as it is and then handle it as you want later or maybe you do want to compress it. In the example above I just simply return the list as it is: ["SW Sixth Avenue", 921, 'Portland'].

Note that we not only validated the inputs but we cleaned and coerced them using form fields, so now we have a list of [string, integer, string], neat right?
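The clean-then-compress idea fits in a few lines of plain Python. This is a sketch only: clean_multi is a hypothetical function, and the builtins str and int stand in for CharField and IntegerField cleaning.

```python
def clean_multi(values, cleaners, compress):
    # Clean each value with the "field" at the same position...
    cleaned = [cleaner(value) for cleaner, value in zip(cleaners, values)]
    # ...then let compress decide the final shape of the result.
    return compress(cleaned)


def keep_list(data_list):
    # Same choice as in AddressField above: return the list untouched.
    return data_list


def to_dict(data_list):
    # An alternative compress that names each part.
    return dict(zip(('street', 'number', 'city'), data_list))


raw = ['SW Sixth Avenue', '921', 'Portland']
as_list = clean_multi(raw, (str, int, str), keep_list)
as_dict = clean_multi(raw, (str, int, str), to_dict)
```

as_list is ['SW Sixth Avenue', 921, 'Portland'] with the number already coerced to an integer, and as_dict shows that compress is free to return a completely different structure.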

More fields and widgets

Django core developer Carl Meyer put together a Django app named django-form-utils that has some nice extra form fields and widgets that you might be interested in.

Conclusions

Hopefully, if you have read the whole article, I will have clarified some of the stuff we went over quite fast in the talk. I cannot promise when the next post will come out, as I’m busy lately, but I will try to get it out soon.

If you find any bugs, poorly explained things or anything else, please leave a comment below and I will fix it asap. Also feel free to post your opinion or comments.

If you like what I write or work on, you can follow me on Twitter or Github.
