view doc/formatting.txt @ 91:5cff450b9ed5 trunk

Changed translator comments extraction behaviour in python source code. Match is now true only if the TAG is on the start of the comment. The TAG will also be stripped from the comment. Added a unittest which tests this.
author palgarvio
date Mon, 11 Jun 2007 22:27:24 +0000
parents 0739bc8e7210
children f5cfd37ec37e
line wrap: on
line source
.. -*- mode: rst; encoding: utf-8 -*-

==========================
Number and Date Formatting
==========================


.. contents:: Contents
   :depth: 2
.. sectnum::


Date Formatting
===============

When working with date and time information in Python, you commonly use the
classes ``date``, ``datetime`` and/or ``time`` from the `datetime package`_.
Babel provides functions for locale-specific formatting of those objects in its
``dates`` module:

.. code-block:: pycon

    >>> from datetime import date, datetime, time
    >>> from babel.dates import format_date, format_datetime, format_time

    >>> d = date(2007, 4, 1)
    >>> format_date(d, locale='en')
    u'Apr 1, 2007'
    >>> format_date(d, locale='de_DE')
    u'01.04.2007'

As this example demonstrates, Babel will automatically choose a date format
that is appropriate for the requested locale.

The ``format_*()`` functions also accept an optional ``format`` argument, which
allows you to choose between one of four format variations:

 * ``short``,
 * ``medium`` (the default),
 * ``long``, and
 * ``full``.

For example:

.. code-block:: pycon

    >>> format_date(d, format='short', locale='en')
    u'4/1/07'
    >>> format_date(d, format='long', locale='en')
    u'April 1, 2007'
    >>> format_date(d, format='full', locale='en')
    u'Sunday, April 1, 2007'

.. _`datetime package`: http://docs.python.org/lib/module-datetime.html


Pattern Syntax
--------------

While Babel makes it simple to use the appropriate date/time format for a given
locale, you can also force it to use custom patterns. Note that Babel uses
different patterns for specifying number and date formats compared to the
Python equivalents (such as ``time.strftime()``), which have mostly been
inherited from C and POSIX. The patterns used in Babel are based on the
`Locale Data Markup Language specification`_ (LDML), which defines them as
follows:

    A date/time pattern is a string of characters, where specific strings of
    characters are replaced with date and time data from a calendar when formatting
    or used to generate data for a calendar when parsing. […]

    Characters may be used multiple times. For example, if ``y`` is used for the
    year, ``yy`` might produce "99", whereas ``yyyy`` produces "1999". For most
    numerical fields, the number of characters specifies the field width. For
    example, if ``h`` is the hour, ``h`` might produce "5", but ``hh`` produces
    "05". For some characters, the count specifies whether an abbreviated or full
    form should be used […]

    Two single quotes represent a literal single quote, either inside or outside
    single quotes. Text within single quotes is not interpreted in any way (except
    for two adjacent single quotes).

For example:

.. code-block:: pycon

    >>> d = date(2007, 4, 1)
    >>> format_date(d, "EEE, MMM d, ''yy", locale='en')
    u"Sun, Apr 1, '07"
    >>> format_date(d, "EEEE, d.M.yyyy", locale='de')
    u'Sonntag, 1.4.2007'

    >>> t = time(15, 30)
    >>> format_time(t, "hh 'o''clock' a", locale='en')
    u"03 o'clock PM"
    >>> format_time(t, 'H:mm a', locale='de')
    u'15:30 nachm.'

    >>> dt = datetime(2007, 4, 1, 15, 30)
    >>> format_datetime(dt, "yyyyy.MMMM.dd GGG hh:mm a", locale='en')
    u'02007.April.01 AD 03:30 PM'

The syntax for custom datetime format patterns is described in detail in the
the `Locale Data Markup Language specification`_. The following table is just a
relatively brief overview.

 .. _`Locale Data Markup Language specification`: http://unicode.org/reports/tr35/#Date_Format_Patterns

-----------
Date Fields
-----------

  +----------+--------+--------------------------------------------------------+
  | Field    | Symbol | Description                                            |
  +==========+========+========================================================+
  | Era      | ``G``  | Replaced with the era string for the current date. One |
  |          |        | to three letters for the abbreviated form, four        |
  |          |        | lettersfor the long form, five for the narrow form     |
  +----------+--------+--------------------------------------------------------+
  | Year     | ``y``  | Replaced by the year. Normally the length specifies    |
  |          |        | the padding, but for two letters it also specifies the |
  |          |        | maximum length.                                        |
  |          +--------+--------------------------------------------------------+
  |          | ``Y``  | Same as ``y`` but uses the ISO year-week calendar.     |
  |          +--------+--------------------------------------------------------+
  |          | ``u``  | ??                                                     |
  +----------+--------+--------------------------------------------------------+
  | Quarter  | ``Q``  | Use one or two for the numerical quarter, three for    |
  |          |        | the abbreviation, or four for the full name.           |
  |          +--------+--------------------------------------------------------+
  |          | ``q``  | Use one or two for the numerical quarter, three for    |
  |          |        | the abbreviation, or four for the full name.           |
  +----------+--------+--------------------------------------------------------+
  | Month    | ``M``  | Use one or two for the numerical month, three for the  |
  |          |        | abbreviation, or four for the full name, or five for   |
  |          |        | the narrow name.                                       |
  |          +--------+--------------------------------------------------------+
  |          | ``L``  | Use one or two for the numerical month, three for the  |
  |          |        | abbreviation, or four for the full name, or 5 for the  |
  |          |        | narrow name.                                           |
  +----------+--------+--------------------------------------------------------+
  | Week     | ``w``  | Week of year.                                          |
  |          +--------+--------------------------------------------------------+
  |          | ``W``  | Week of month.                                         |
  +----------+--------+--------------------------------------------------------+
  | Day      | ``d``  | Day of month.                                          |
  |          +--------+--------------------------------------------------------+
  |          | ``D``  | Day of year.                                           |
  |          +--------+--------------------------------------------------------+
  |          | ``F``  | Day of week in month.                                  |
  |          +--------+--------------------------------------------------------+
  |          | ``g``  | ??                                                     |
  +----------+--------+--------------------------------------------------------+
  | Week day | ``E``  | Day of week. Use one through three letters for the     |
  |          |        | short day, or four for the full name, or five for the  |
  |          |        | narrow name.                                           |
  |          +--------+--------------------------------------------------------+
  |          | ``e``  | Local day of week. Same as E except adds a numeric     |
  |          |        | value that will depend on the local starting day of    |
  |          |        | the week, using one or two letters.                    |
  |          +--------+--------------------------------------------------------+
  |          | ``c``  | ??                                                     |
  +----------+--------+--------------------------------------------------------+

-----------
Time Fields
-----------

  +----------+--------+--------------------------------------------------------+
  | Field    | Symbol | Description                                            |
  +==========+========+========================================================+
  | Period   | ``a``  | AM or PM                                               |
  +----------+--------+--------------------------------------------------------+
  | Hour     | ``h``  | Hour [1-12].                                           |
  |          +--------+--------------------------------------------------------+
  |          | ``H``  | Hour [0-23].                                           |
  |          +--------+--------------------------------------------------------+
  |          | ``K``  | Hour [0-11].                                           |
  |          +--------+--------------------------------------------------------+
  |          | ``k``  | Hour [1-24].                                           |
  +----------+--------+--------------------------------------------------------+
  | Minute   | ``m``  | Use one or two for zero places padding.                |
  +----------+--------+--------------------------------------------------------+
  | Second   | ``s``  | Use one or two for zero places padding.                |
  |          +--------+--------------------------------------------------------+
  |          | ``S``  | Fractional second, rounds to the count of letters.     |
  |          +--------+--------------------------------------------------------+
  |          | ``A``  | Milliseconds in day.                                   |
  +----------+--------+--------------------------------------------------------+ 
  | Timezone | ``z``  | Use one to three letters for the short timezone or     |
  |          |        | four for the full name.                                |
  |          +--------+--------------------------------------------------------+
  |          | ``Z``  | Use one to three letters for RFC 822, four letters for |
  |          |        | GMT format.                                            |
  |          +--------+--------------------------------------------------------+
  |          | ``v``  | Use one letter for short wall (generic) time, four for |
  |          |        | long wall time.                                        |
  +----------+--------+--------------------------------------------------------+


Time-Zone Support
-----------------

Many of the verbose time formats include the time-zone, but time-zone
information is not by default available for the Python ``datetime`` and
``time`` objects. The standard library includes only the abstract ``tzinfo``
class, which you need appropriate implementations for to actually use in your
application. Babel includes a ``tzinfo`` implementation for UTC (Universal
Time).

For real time-zone support, it is strongly recommended that you use the
third-party package `pytz`_, which includes the definitions of practically all
of the time-zones used on the world, as well as important functions for
reliably converting from UTC to local time, and vice versa:

.. code-block:: pycon

    >>> from datetime import time
    >>> from pytz import timezone, utc
    >>> dt = datetime(2007, 04, 01, 15, 30, tzinfo=utc)
    >>> eastern = timezone('US/Eastern')
    >>> format_datetime(dt, 'H:mm Z', tzinfo=eastern, locale='en_US')
    u'11:30 -0400'

The recommended approach to deal with different time-zones in a Python
application is to always use UTC internally, and only convert from/to the users
time-zone when accepting user input and displaying date/time data, respectively.
You can use Babel together with ``pytz`` to apply a time-zone to any
``datetime`` or ``time`` object for display, leaving the original information
unchanged:

.. code-block:: pycon

    >>> british = timezone('Europe/London')
    >>> format_datetime(dt, 'H:mm zzzz', tzinfo=british, locale='en_US')
    u'16:30 British Summer Time'

Here, the given UTC time is adjusted to the "Europe/London" time-zone, and
daylight savings time is taken into account. Daylight savings time is also
applied to ``format_time``, but because the actual date is unknown in that
case, the current day is assumed to determine whether DST or standard time
should be used.

 .. _`pytz`: http://pytz.sourceforge.net/


Parsing Dates
-------------

Babel can also parse date and time information in a locale-sensitive manner:

.. code-block:: pycon

    >>> from babel.dates import parse_date, parse_datetime, parse_time


Number Formatting
=================

Support for locale-specific formatting and parsing of numbers is provided by
the ``babel.numbers`` module:

.. code-block:: pycon

    >>> from babel.numbers import format_number, format_decimal, format_percent

Examples:

.. code-block:: pycon

    >>> format_decimal(1.2345, locale='en_US')
    u'1.234'
    >>> format_decimal(1.2345, locale='sv_SE')
    u'1,234'
    >>> format_decimal(12345, locale='de_DE')
    u'12.345'


Pattern Syntax
--------------

While Babel makes it simple to use the appropriate number format for a given
locale, you can also force it to use custom patterns. As with date/time
formatting patterns, the patterns Babel supports for number formatting are
based on the `Locale Data Markup Language specification`_ (LDML).

Examples:

.. code-block:: pycon

    >>> format_decimal(-1.2345, format='#,##0.##;-#', locale='en')
    u'-1.23'
    >>> format_decimal(-1.2345, format='#,##0.##;(#)', locale='en')
    u'(1.23)'

The syntax for custom number format patterns is described in detail in the
the specification. The following table is just a relatively brief overview.

  +----------+-----------------------------------------------------------------+
  | Symbol   | Description                                                     |
  +==========+=================================================================+
  | ``0``    | Digit                                                           |
  +----------+-----------------------------------------------------------------+
  | ``1-9``  | '1' through '9' indicate rounding.                              |
  +----------+-----------------------------------------------------------------+
  | ``@``    | Significant digit                                               |
  +----------+-----------------------------------------------------------------+
  | ``#``    | Digit, zero shows as absent                                     |
  +----------+-----------------------------------------------------------------+
  | ``.``    | Decimal separator or monetary decimal separator                 |
  +----------+-----------------------------------------------------------------+
  | ``-``    | Minus sign                                                      |
  +----------+-----------------------------------------------------------------+
  | ``,``    | Grouping separator                                              |
  +----------+-----------------------------------------------------------------+
  | ``E``    | Separates mantissa and exponent in scientific notation          |
  +----------+-----------------------------------------------------------------+
  | ``+``    | Prefix positive exponents with localized plus sign              |
  +----------+-----------------------------------------------------------------+
  | ``;``    | Separates positive and negative subpatterns                     |
  +----------+-----------------------------------------------------------------+
  | ``%``    | Multiply by 100 and show as percentage                          |
  +----------+-----------------------------------------------------------------+
  | ``‰``    | Multiply by 1000 and show as per mille                          |
  +----------+-----------------------------------------------------------------+
  | ``¤``    | Currency sign, replaced by currency symbol. If doubled,         |
  |          | replaced by international currency symbol. If tripled, uses the |
  |          | long form of the decimal symbol.                                |
  +----------+-----------------------------------------------------------------+
  | ``'``    | Used to quote special characters in a prefix or suffix          |
  +----------+-----------------------------------------------------------------+
  | ``*``    | Pad escape, precedes pad character                              |
  +----------+-----------------------------------------------------------------+


Parsing Numbers
---------------

Babel can also parse numeric data in a locale-sensitive manner:

.. code-block:: pycon

    >>> from babel.numbers import parse_decimal, parse_number

Examples:

.. code-block:: pycon

    >>> parse_decimal('1,099.98', locale='en_US')
    1099.98
    >>> parse_decimal('1.099,98', locale='de')
    1099.98
    >>> parse_decimal('2,109,998', locale='de')
    Traceback (most recent call last):
      ...
    NumberFormatError: '2,109,998' is not a valid decimal number
Copyright (C) 2012-2017 Edgewall Software