babel/old/babel-test: scripts/import

annotate scripts/import_cldr.py @ 51:7f61453c1bea

Fixed a bug in the handling of plural msgids when writing the `.pot` file. Renamed the old `write_po` to `write_pot`, which is what it actually does and which also makes room for the new `write_po`. Changed tests accordingly. Added support for creating new localized catalogs from a catalog template via `write_po`.

author	palgarvio
date	Thu, 07 Jun 2007 22:48:47 +0000
parents	3666f3d3df15
children	7478d663561f

rev	line source
1 f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	1 #!/usr/bin/env python
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	2 # -*- coding: utf-8 -*-
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	3 #
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	4 # Copyright (C) 2007 Edgewall Software
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	5 # All rights reserved.
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	6 #
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	7 # This software is licensed as described in the file COPYING, which
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	8 # you should have received as part of this distribution. The terms
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	9 # are also available at http://babel.edgewall.org/wiki/License.
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	10 #
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	11 # This software consists of voluntary contributions made by many
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	12 # individuals. For the exact contribution history, see the revision
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	13 # history and logs, available at http://babel.edgewall.org/log/.
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	14
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	15 import copy
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	16 from optparse import OptionParser
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	17 import os
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	18 import pickle
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	19 import sys
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	20 try:
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	21 from xml.etree.ElementTree import parse
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	22 except ImportError:
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	23 from elementtree.ElementTree import parse
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	24
9 3be73c6f01f1 Add basic support for number format patterns. jonas parents: 8 diff changeset	25 from babel import dates, numbers
1 f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	26
15 76985c08a339 Minor date formatting improvements. cmlenz parents: 13 diff changeset	27 weekdays = {'mon': 0, 'tue': 1, 'wed': 2, 'thu': 3, 'fri': 4, 'sat': 5,
76985c08a339 Minor date formatting improvements. cmlenz parents: 13 diff changeset	28 'sun': 6}
8 9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	29
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	30 try:
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	31 any
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	32 except NameError:
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	33 def any(iterable):
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	34 return filter(None, list(iterable))
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	35
1 f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	36 def _text(elem):
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	37 buf = [elem.text or '']
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	38 for child in elem:
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	39 buf.append(_text(child))
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	40 buf.append(elem.tail or '')
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	41 return u''.join(filter(None, buf)).strip()
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	42
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	43 def main():
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	44 parser = OptionParser(usage='%prog path/to/cldr')
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	45 options, args = parser.parse_args()
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	46 if len(args) != 1:
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	47 parser.error('incorrect number of arguments')
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	48
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	49 srcdir = args[0]
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	50 destdir = os.path.join(os.path.dirname(os.path.abspath(sys.argv[0])),
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	51 '..', 'babel', 'localedata')
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	52
8 9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	53 sup = parse(os.path.join(srcdir, 'supplemental', 'supplementalData.xml'))
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	54
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	55 # build a territory containment mapping for inheritance
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	56 regions = {}
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	57 for elem in sup.findall('//territoryContainment/group'):
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	58 regions[elem.attrib['type']] = elem.attrib['contains'].split()
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	59
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	60 # Resolve territory containment
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	61 territory_containment = {}
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	62 region_items = regions.items()
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	63 region_items.sort()
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	64 for group, territory_list in region_items:
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	65 for territory in territory_list:
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	66 containers = territory_containment.setdefault(territory, set([]))
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	67 if group in territory_containment:
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	68 containers |= territory_containment[group]
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	69 containers.add(group)
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	70
1 f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	71 filenames = os.listdir(os.path.join(srcdir, 'main'))
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	72 filenames.remove('root.xml')
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	73 filenames.sort(lambda a,b: len(a)-len(b))
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	74 filenames.insert(0, 'root.xml')
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	75
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	76 dicts = {}
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	77
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	78 for filename in filenames:
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	79 print>>sys.stderr, 'Processing input file %r' % filename
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	80 stem, ext = os.path.splitext(filename)
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	81 if ext != '.xml':
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	82 continue
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	83
26 710090104678 * Reduce size of locale data pickles by only storing the data provided by each locale itself, and merging inherited data at runtime. cmlenz parents: 22 diff changeset	84 tree = parse(os.path.join(srcdir, 'main', filename))
1 f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	85 data = {}
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	86
8 9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	87 language = None
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	88 elem = tree.find('//identity/language')
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	89 if elem is not None:
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	90 language = elem.attrib['type']
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	91 print>>sys.stderr, ' Language: %r' % language
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	92
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	93 territory = None
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	94 elem = tree.find('//identity/territory')
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	95 if elem is not None:
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	96 territory = elem.attrib['type']
13 b6c0de43fa40 Extended and documented `LazyProxy`. cmlenz parents: 9 diff changeset	97 else:
b6c0de43fa40 Extended and documented `LazyProxy`. cmlenz parents: 9 diff changeset	98 territory = '001' # world
8 9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	99 print>>sys.stderr, ' Territory: %r' % territory
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	100 regions = territory_containment.get(territory, [])
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	101 print>>sys.stderr, ' Regions: %r' % regions
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	102
1 f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	103 # <localeDisplayNames>
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	104
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	105 territories = data.setdefault('territories', {})
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	106 for elem in tree.findall('//territories/territory'):
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	107 if 'draft' in elem.attrib and elem.attrib['type'] in territories:
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	108 continue
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	109 territories[elem.attrib['type']] = _text(elem)
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	110
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	111 languages = data.setdefault('languages', {})
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	112 for elem in tree.findall('//languages/language'):
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	113 if 'draft' in elem.attrib and elem.attrib['type'] in languages:
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	114 continue
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	115 languages[elem.attrib['type']] = _text(elem)
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	116
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	117 variants = data.setdefault('variants', {})
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	118 for elem in tree.findall('//variants/variant'):
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	119 if 'draft' in elem.attrib and elem.attrib['type'] in variants:
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	120 continue
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	121 variants[elem.attrib['type']] = _text(elem)
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	122
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	123 scripts = data.setdefault('scripts', {})
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	124 for elem in tree.findall('//scripts/script'):
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	125 if 'draft' in elem.attrib and elem.attrib['type'] in scripts:
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	126 continue
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	127 scripts[elem.attrib['type']] = _text(elem)
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	128
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	129 # <dates>
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	130
8 9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	131 week_data = data.setdefault('week_data', {})
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	132 supelem = sup.find('//weekData')
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	133
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	134 for elem in supelem.findall('minDays'):
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	135 territories = elem.attrib['territories'].split()
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	136 if territory in territories or any([r in territories for r in regions]):
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	137 week_data['min_days'] = int(elem.attrib['count'])
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	138
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	139 for elem in supelem.findall('firstDay'):
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	140 territories = elem.attrib['territories'].split()
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	141 if territory in territories or any([r in territories for r in regions]):
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	142 week_data['first_day'] = weekdays[elem.attrib['day']]
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	143
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	144 for elem in supelem.findall('weekendStart'):
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	145 territories = elem.attrib['territories'].split()
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	146 if territory in territories or any([r in territories for r in regions]):
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	147 week_data['weekend_start'] = weekdays[elem.attrib['day']]
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	148
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	149 for elem in supelem.findall('weekendEnd'):
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	150 territories = elem.attrib['territories'].split()
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	151 if territory in territories or any([r in territories for r in regions]):
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	152 week_data['weekend_end'] = weekdays[elem.attrib['day']]
9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	153
1 f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	154 time_zones = data.setdefault('time_zones', {})
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	155 for elem in tree.findall('//timeZoneNames/zone'):
28 11278622ede9 Import basic timezone info from CLDR (see #3). Still missing a couple other pieces in the puzzle. cmlenz parents: 26 diff changeset	156 info = {}
11278622ede9 Import basic timezone info from CLDR (see #3). Still missing a couple other pieces in the puzzle. cmlenz parents: 26 diff changeset	157 city = elem.findtext('exemplarCity')
11278622ede9 Import basic timezone info from CLDR (see #3). Still missing a couple other pieces in the puzzle. cmlenz parents: 26 diff changeset	158 if city:
11278622ede9 Import basic timezone info from CLDR (see #3). Still missing a couple other pieces in the puzzle. cmlenz parents: 26 diff changeset	159 info['city'] = unicode(city)
11278622ede9 Import basic timezone info from CLDR (see #3). Still missing a couple other pieces in the puzzle. cmlenz parents: 26 diff changeset	160 for child in elem.findall('long/*'):
11278622ede9 Import basic timezone info from CLDR (see #3). Still missing a couple other pieces in the puzzle. cmlenz parents: 26 diff changeset	161 info.setdefault('long', {})[child.tag] = unicode(child.text)
11278622ede9 Import basic timezone info from CLDR (see #3). Still missing a couple other pieces in the puzzle. cmlenz parents: 26 diff changeset	162 for child in elem.findall('short/*'):
11278622ede9 Import basic timezone info from CLDR (see #3). Still missing a couple other pieces in the puzzle. cmlenz parents: 26 diff changeset	163 info.setdefault('short', {})[child.tag] = unicode(child.text)
11278622ede9 Import basic timezone info from CLDR (see #3). Still missing a couple other pieces in the puzzle. cmlenz parents: 26 diff changeset	164 time_zones[elem.attrib['type']] = info
1 f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	165
34 3666f3d3df15 Extended time-zone support. cmlenz parents: 33 diff changeset	166 zone_aliases = data.setdefault('zone_aliases', {})
3666f3d3df15 Extended time-zone support. cmlenz parents: 33 diff changeset	167 if stem == 'root':
3666f3d3df15 Extended time-zone support. cmlenz parents: 33 diff changeset	168 for elem in sup.findall('//timezoneData/zoneFormatting/zoneItem'):
3666f3d3df15 Extended time-zone support. cmlenz parents: 33 diff changeset	169 if 'aliases' in elem.attrib:
3666f3d3df15 Extended time-zone support. cmlenz parents: 33 diff changeset	170 canonical_id = elem.attrib['type']
3666f3d3df15 Extended time-zone support. cmlenz parents: 33 diff changeset	171 for alias in elem.attrib['aliases'].split():
3666f3d3df15 Extended time-zone support. cmlenz parents: 33 diff changeset	172 zone_aliases[alias] = canonical_id
3666f3d3df15 Extended time-zone support. cmlenz parents: 33 diff changeset	173
1 f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	174 for calendar in tree.findall('//calendars/calendar'):
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	175 if calendar.attrib['type'] != 'gregorian':
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	176 # TODO: support other calendar types
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	177 continue
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	178
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	179 months = data.setdefault('months', {})
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	180 for ctxt in calendar.findall('months/monthContext'):
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	181 ctxts = months.setdefault(ctxt.attrib['type'], {})
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	182 for width in ctxt.findall('monthWidth'):
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	183 widths = ctxts.setdefault(width.attrib['type'], {})
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	184 for elem in width.findall('month'):
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	185 if 'draft' in elem.attrib and int(elem.attrib['type']) in widths:
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	186 continue
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	187 widths[int(elem.attrib.get('type'))] = unicode(elem.text)
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	188
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	189 days = data.setdefault('days', {})
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	190 for ctxt in calendar.findall('days/dayContext'):
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	191 ctxts = days.setdefault(ctxt.attrib['type'], {})
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	192 for width in ctxt.findall('dayWidth'):
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	193 widths = ctxts.setdefault(width.attrib['type'], {})
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	194 for elem in width.findall('day'):
8 9132c9218745 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	195 dtype = weekdays[elem.attrib['type']]
1 f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	196 if 'draft' in elem.attrib and dtype in widths:
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	197 continue
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	198 widths[dtype] = unicode(elem.text)
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	199
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	200 quarters = data.setdefault('quarters', {})
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	201 for ctxt in calendar.findall('quarters/quarterContext'):
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	202 ctxts = quarters.setdefault(ctxt.attrib['type'], {})
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	203 for width in ctxt.findall('quarterWidth'):
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	204 widths = ctxts.setdefault(width.attrib['type'], {})
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	205 for elem in width.findall('quarter'):
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	206 if 'draft' in elem.attrib and int(elem.attrib['type']) in widths:
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	207 continue
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	208 widths[int(elem.attrib.get('type'))] = unicode(elem.text)
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	209
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	210 eras = data.setdefault('eras', {})
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	211 for width in calendar.findall('eras/*'):
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	212 ewidth = {'eraNames': 'wide', 'eraAbbr': 'abbreviated'}[width.tag]
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	213 widths = eras.setdefault(ewidth, {})
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	214 for elem in width.findall('era'):
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	215 if 'draft' in elem.attrib and int(elem.attrib['type']) in widths:
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	216 continue
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	217 widths[int(elem.attrib.get('type'))] = unicode(elem.text)
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	218
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	219 # AM/PM
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	220 periods = data.setdefault('periods', {})
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	221 for elem in calendar.findall('am'):
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	222 if 'draft' in elem.attrib and elem.tag in periods:
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	223 continue
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	224 periods[elem.tag] = unicode(elem.text)
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	225 for elem in calendar.findall('pm'):
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	226 if 'draft' in elem.attrib and elem.tag in periods:
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	227 continue
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	228 periods[elem.tag] = unicode(elem.text)
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	229
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	230 date_formats = data.setdefault('date_formats', {})
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	231 for elem in calendar.findall('dateFormats/dateFormatLength'):
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	232 if 'draft' in elem.attrib and elem.attrib.get('type') in date_formats:
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	233 continue
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	234 try:
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	235 date_formats[elem.attrib.get('type')] = \
9 3be73c6f01f1 Add basic support for number format patterns. jonas parents: 8 diff changeset	236 dates.parse_pattern(unicode(elem.findtext('dateFormat/pattern')))
1 f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	237 except ValueError, e:
26 710090104678 * Reduce size of locale data pickles by only storing the data provided by each locale itself, and merging inherited data at runtime. cmlenz parents: 22 diff changeset	238 print>>sys.stderr, 'ERROR: %s' % e
1 f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	239
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	240 time_formats = data.setdefault('time_formats', {})
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	241 for elem in calendar.findall('timeFormats/timeFormatLength'):
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	242 if 'draft' in elem.attrib and elem.attrib.get('type') in time_formats:
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	243 continue
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	244 try:
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	245 time_formats[elem.attrib.get('type')] = \
9 3be73c6f01f1 Add basic support for number format patterns. jonas parents: 8 diff changeset	246 dates.parse_pattern(unicode(elem.findtext('timeFormat/pattern')))
1 f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	247 except ValueError, e:
26 710090104678 * Reduce size of locale data pickles by only storing the data provided by each locale itself, and merging inherited data at runtime. cmlenz parents: 22 diff changeset	248 print>>sys.stderr, 'ERROR: %s' % e
1 f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	249
33 0740b6d31799 * Import datetime patterns from CLDR. cmlenz parents: 28 diff changeset	250 datetime_formats = data.setdefault('datetime_formats', {})
0740b6d31799 * Import datetime patterns from CLDR. cmlenz parents: 28 diff changeset	251 for elem in calendar.findall('dateTimeFormats/dateTimeFormatLength'):
0740b6d31799 * Import datetime patterns from CLDR. cmlenz parents: 28 diff changeset	252 if 'draft' in elem.attrib and elem.attrib.get('type') in datetime_formats:
0740b6d31799 * Import datetime patterns from CLDR. cmlenz parents: 28 diff changeset	253 continue
0740b6d31799 * Import datetime patterns from CLDR. cmlenz parents: 28 diff changeset	254 try:
0740b6d31799 * Import datetime patterns from CLDR. cmlenz parents: 28 diff changeset	255 datetime_formats[elem.attrib.get('type')] = \
0740b6d31799 * Import datetime patterns from CLDR. cmlenz parents: 28 diff changeset	256 unicode(elem.findtext('dateTimeFormat/pattern'))
0740b6d31799 * Import datetime patterns from CLDR. cmlenz parents: 28 diff changeset	257 except ValueError, e:
0740b6d31799 * Import datetime patterns from CLDR. cmlenz parents: 28 diff changeset	258 print>>sys.stderr, 'ERROR: %s' % e
0740b6d31799 * Import datetime patterns from CLDR. cmlenz parents: 28 diff changeset	259
1 f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	260 # <numbers>
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	261
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	262 number_symbols = data.setdefault('number_symbols', {})
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	263 for elem in tree.findall('//numbers/symbols/*'):
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	264 number_symbols[elem.tag] = unicode(elem.text)
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	265
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	266 decimal_formats = data.setdefault('decimal_formats', {})
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	267 for elem in tree.findall('//decimalFormats/decimalFormatLength'):
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	268 if 'draft' in elem.attrib and elem.attrib.get('type') in decimal_formats:
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	269 continue
26 710090104678 * Reduce size of locale data pickles by only storing the data provided by each locale itself, and merging inherited data at runtime. cmlenz parents: 22 diff changeset	270 pattern = unicode(elem.findtext('decimalFormat/pattern'))
710090104678 * Reduce size of locale data pickles by only storing the data provided by each locale itself, and merging inherited data at runtime. cmlenz parents: 22 diff changeset	271 decimal_formats[elem.attrib.get('type')] = numbers.parse_pattern(pattern)
1 f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	272
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	273 scientific_formats = data.setdefault('scientific_formats', {})
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	274 for elem in tree.findall('//scientificFormats/scientificFormatLength'):
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	275 if 'draft' in elem.attrib and elem.attrib.get('type') in scientific_formats:
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	276 continue
26 710090104678 * Reduce size of locale data pickles by only storing the data provided by each locale itself, and merging inherited data at runtime. cmlenz parents: 22 diff changeset	277 # FIXME: should use numbers.parse_pattern
1 f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	278 scientific_formats[elem.attrib.get('type')] = unicode(elem.findtext('scientificFormat/pattern'))
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	279
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	280 currency_formats = data.setdefault('currency_formats', {})
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	281 for elem in tree.findall('//currencyFormats/currencyFormatLength'):
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	282 if 'draft' in elem.attrib and elem.attrib.get('type') in currency_formats:
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	283 continue
26 710090104678 * Reduce size of locale data pickles by only storing the data provided by each locale itself, and merging inherited data at runtime. cmlenz parents: 22 diff changeset	284 # FIXME: should use numbers.parse_pattern
1 f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	285 currency_formats[elem.attrib.get('type')] = unicode(elem.findtext('currencyFormat/pattern'))
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	286
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	287 percent_formats = data.setdefault('percent_formats', {})
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	288 for elem in tree.findall('//percentFormats/percentFormatLength'):
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	289 if 'draft' in elem.attrib and elem.attrib.get('type') in percent_formats:
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	290 continue
26 710090104678 * Reduce size of locale data pickles by only storing the data provided by each locale itself, and merging inherited data at runtime. cmlenz parents: 22 diff changeset	291 pattern = unicode(elem.findtext('percentFormat/pattern'))
710090104678 * Reduce size of locale data pickles by only storing the data provided by each locale itself, and merging inherited data at runtime. cmlenz parents: 22 diff changeset	292 percent_formats[elem.attrib.get('type')] = numbers.parse_pattern(pattern)
1 f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	293
26 710090104678 * Reduce size of locale data pickles by only storing the data provided by each locale itself, and merging inherited data at runtime. cmlenz parents: 22 diff changeset	294 currency_names = data.setdefault('currency_names', {})
710090104678 * Reduce size of locale data pickles by only storing the data provided by each locale itself, and merging inherited data at runtime. cmlenz parents: 22 diff changeset	295 currency_symbols = data.setdefault('currency_symbols', {})
1 f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	296 for elem in tree.findall('//currencies/currency'):
26 710090104678 * Reduce size of locale data pickles by only storing the data provided by each locale itself, and merging inherited data at runtime. cmlenz parents: 22 diff changeset	297 name = elem.findtext('displayName')
710090104678 * Reduce size of locale data pickles by only storing the data provided by each locale itself, and merging inherited data at runtime. cmlenz parents: 22 diff changeset	298 if name:
710090104678 * Reduce size of locale data pickles by only storing the data provided by each locale itself, and merging inherited data at runtime. cmlenz parents: 22 diff changeset	299 currency_names[elem.attrib['type']] = unicode(name)
710090104678 * Reduce size of locale data pickles by only storing the data provided by each locale itself, and merging inherited data at runtime. cmlenz parents: 22 diff changeset	300 symbol = elem.findtext('symbol')
710090104678 * Reduce size of locale data pickles by only storing the data provided by each locale itself, and merging inherited data at runtime. cmlenz parents: 22 diff changeset	301 if symbol:
710090104678 * Reduce size of locale data pickles by only storing the data provided by each locale itself, and merging inherited data at runtime. cmlenz parents: 22 diff changeset	302 currency_symbols[elem.attrib['type']] = unicode(symbol)
1 f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	303
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	304 dicts[stem] = data
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	305 outfile = open(os.path.join(destdir, stem + '.dat'), 'wb')
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	306 try:
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	307 pickle.dump(data, outfile, 2)
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	308 finally:
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	309 outfile.close()
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	310
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	311 if __name__ == '__main__':
f71ca60f2a4a Import of initial code base. cmlenz parents: diff changeset	312 main()
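
The format loops in the listing above all share one rule: an element carrying a `draft` attribute is skipped only when an entry of the same `type` has already been stored. The following is a minimal sketch of that rule in isolation, not the script's own code. It is written for Python 3 (the listing is Python 2), the XML fragment is made up for illustration, and the helper name `collect_patterns` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up LDML fragment, for illustration only (not real CLDR data).
LDML = """
<ldml>
  <dates><calendars><calendar type="gregorian">
    <dateFormats>
      <dateFormatLength type="short">
        <dateFormat><pattern>M/d/yy</pattern></dateFormat>
      </dateFormatLength>
      <dateFormatLength type="short" draft="unconfirmed">
        <dateFormat><pattern>d.M.yy</pattern></dateFormat>
      </dateFormatLength>
      <dateFormatLength type="long" draft="unconfirmed">
        <dateFormat><pattern>MMMM d, yyyy</pattern></dateFormat>
      </dateFormatLength>
    </dateFormats>
  </calendar></calendars></dates>
</ldml>
"""


def collect_patterns(parent, list_path, pattern_path):
    """Hypothetical helper: map each 'type' attribute to its raw pattern text.

    A draft element is ignored only if its type is already present;
    a draft element for a type not seen so far is still stored.
    """
    formats = {}
    for elem in parent.findall(list_path):
        key = elem.attrib.get('type')
        if 'draft' in elem.attrib and key in formats:
            continue
        formats[key] = elem.findtext(pattern_path)
    return formats


root = ET.fromstring(LDML)
calendar = root.find('.//calendar')
date_formats = collect_patterns(
    calendar, 'dateFormats/dateFormatLength', 'dateFormat/pattern')
print(date_formats)
```

In this sketch the draft `short` entry is dropped because a `short` entry already exists, while the draft `long` entry is kept because nothing else supplies that type. The check depends on document order: a non-draft element that follows a draft one of the same type would overwrite it. The sketch stores the raw pattern text and leaves out the `dates.parse_pattern` and `numbers.parse_pattern` calls the listing applies to some of the formats.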
