annotate scripts/import_cldr.py @ 393:dd0df1242ae3

Fixed a bug in plural.py that caused a traceback for some locales, changed the `__mod__` DateTimePattern to not raise exceptions but return NotImplemented.
author aronacher
date Mon, 14 Jul 2008 22:18:39 +0000
parents 34c0a25b1ed7
children 17515f1efee0
rev   line source
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
1 #!/usr/bin/env python
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
2 # -*- coding: utf-8 -*-
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
3 #
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
4 # Copyright (C) 2007 Edgewall Software
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
5 # All rights reserved.
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
6 #
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
7 # This software is licensed as described in the file COPYING, which
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
8 # you should have received as part of this distribution. The terms
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
9 # are also available at http://babel.edgewall.org/wiki/License.
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
10 #
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
11 # This software consists of voluntary contributions made by many
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
12 # individuals. For the exact contribution history, see the revision
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
13 # history and logs, available at http://babel.edgewall.org/log/.
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
14
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
15 import copy
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
16 from optparse import OptionParser
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
17 import os
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
18 import pickle
377
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
19 import re
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
20 import sys
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
21 try:
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
22 from xml.etree.ElementTree import parse
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
23 except ImportError:
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
24 from elementtree.ElementTree import parse
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
25
67
ad48b95af0d9 Add Babel source path to CLDR import script automatically for asmodai ;-).
cmlenz
parents: 36
diff changeset
26 # Make sure we're using Babel source, and not some previously installed version
ad48b95af0d9 Add Babel source path to CLDR import script automatically for asmodai ;-).
cmlenz
parents: 36
diff changeset
27 sys.path.insert(0, os.path.join(os.path.dirname(sys.argv[0]), '..'))
ad48b95af0d9 Add Babel source path to CLDR import script automatically for asmodai ;-).
cmlenz
parents: 36
diff changeset
28
11
11f64b232b04 Add basic support for number format patterns.
jonas
parents: 10
diff changeset
29 from babel import dates, numbers
392
34c0a25b1ed7 Preliminary support for timedelta formatting (see #126), and import/expose the locale plural rules from the CLDR.
cmlenz
parents: 387
diff changeset
30 from babel.plural import PluralRule
377
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
31 from babel.localedata import Alias
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
32
17
aa33ad077d24 Minor date formatting improvements.
cmlenz
parents: 15
diff changeset
33 weekdays = {'mon': 0, 'tue': 1, 'wed': 2, 'thu': 3, 'fri': 4, 'sat': 5,
aa33ad077d24 Minor date formatting improvements.
cmlenz
parents: 15
diff changeset
34 'sun': 6}
10
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
35
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
36 try:
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
37 any
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
38 except NameError:
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
39 def any(iterable):
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
40 return bool(filter(None, list(iterable)))
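The `try`/`except NameError` block above supplies a stand-in for the built-in `any` on interpreters that predate it. As a minimal sketch of what such a fallback could look like when a strict boolean result is wanted, assuming only that the argument is iterable (the name `any_compat` is hypothetical and not part of the script):

```python
def any_compat(iterable):
    """Return True if at least one element of iterable is truthy."""
    for item in iterable:
        if item:
            # Stop at the first truthy element, like the built-in any().
            return True
    return False


print(any_compat([0, '', 3]))
print(any_compat([]))
```

Unlike a `filter`-based version, this variant short-circuits and never builds an intermediate list.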
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
41
377
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
42
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
43 def _text(elem):
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
44 buf = [elem.text or '']
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
45 for child in elem:
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
46 buf.append(_text(child))
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
47 buf.append(elem.tail or '')
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
48 return u''.join(filter(None, buf)).strip()
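The `_text` helper above flattens an element and its descendants into a single stripped string. The following is a small Python 3 sketch of the same idea, not the script's own code: the function name `flatten_text` and the sample element are made up for illustration.

```python
from xml.etree.ElementTree import fromstring


def flatten_text(elem):
    """Join the text of elem, its descendants, and its tail, then strip."""
    buf = [elem.text or '']
    for child in elem:
        # Each child contributes its own text plus the tail that follows it.
        buf.append(flatten_text(child))
    buf.append(elem.tail or '')
    return ''.join(buf).strip()


# Hypothetical sample: display name with nested markup and stray whitespace.
sample = fromstring('<territory type="DE">  Ger<b>ma</b>ny </territory>')
print(flatten_text(sample))
```

Because the tail is appended inside the recursive call, text that follows a child element ends up in the right position in the result.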
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
49
377
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
50
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
51 NAME_RE = re.compile(r"^\w+$")
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
52 TYPE_ATTR_RE = re.compile(r"^\w+\[@type='(.*?)'\]$")
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
53
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
54 NAME_MAP = {
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
55 'dateFormats': 'date_formats',
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
56 'dateTimeFormats': 'datetime_formats',
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
57 'eraAbbr': 'abbreviated',
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
58 'eraNames': 'wide',
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
59 'eraNarrow': 'narrow',
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
60 'timeFormats': 'time_formats'
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
61 }
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
62
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
63 def _translate_alias(ctxt, path):
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
64 parts = path.split('/')
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
65 keys = ctxt[:]
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
66 for part in parts:
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
67 if part == '..':
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
68 keys.pop()
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
69 else:
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
70 match = TYPE_ATTR_RE.match(part)
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
71 if match:
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
72 keys.append(match.group(1))
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
73 else:
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
74 assert NAME_RE.match(part)
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
75 keys.append(NAME_MAP.get(part, part))
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
76 return keys
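To see what `_translate_alias` does with a relative alias path, here is a self-contained Python 3 sketch. It repeats the logic shown above so that it runs on its own; the name `translate_alias`, the shortened `NAME_MAP`, and the context lists and paths passed in are illustrative assumptions, not values taken from the CLDR data.

```python
import re

NAME_RE = re.compile(r"^\w+$")
TYPE_ATTR_RE = re.compile(r"^\w+\[@type='(.*?)'\]$")
# Shortened mapping, for illustration only.
NAME_MAP = {'dateFormats': 'date_formats', 'eraAbbr': 'abbreviated'}


def translate_alias(ctxt, path):
    """Resolve a relative XPath-like alias against a list of dict keys."""
    keys = list(ctxt)
    for part in path.split('/'):
        if part == '..':
            keys.pop()                       # step up one level
        else:
            match = TYPE_ATTR_RE.match(part)
            if match:
                keys.append(match.group(1))  # keep only the @type value
            else:
                assert NAME_RE.match(part)
                keys.append(NAME_MAP.get(part, part))
    return keys


# Hypothetical context and alias path.
print(translate_alias(
    ['months', 'format', 'abbreviated'],
    "../../monthContext[@type='stand-alone']/monthWidth[@type='wide']"))
print(translate_alias(['eras'], 'eraAbbr'))
```

Each `..` drops the last key, a step of the form `name[@type='x']` contributes only `x`, and a bare element name is mapped through `NAME_MAP` when an entry exists.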
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
77
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
78
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
79 def main():
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
80 parser = OptionParser(usage='%prog path/to/cldr')
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
81 options, args = parser.parse_args()
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
82 if len(args) != 1:
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
83 parser.error('incorrect number of arguments')
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
84
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
85 srcdir = args[0]
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
86 destdir = os.path.join(os.path.dirname(os.path.abspath(sys.argv[0])),
235
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
87 '..', 'babel')
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
88
10
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
89 sup = parse(os.path.join(srcdir, 'supplemental', 'supplementalData.xml'))
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
90
347
c22f292731be Update to CLDR 1.5.1, which split out the metazone mappings into a separate supplemental file.
cmlenz
parents: 235
diff changeset
91 # Import global data from the supplemental files
235
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
92 global_data = {}
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
93
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
94 territory_zones = global_data.setdefault('territory_zones', {})
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
95 zone_aliases = global_data.setdefault('zone_aliases', {})
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
96 zone_territories = global_data.setdefault('zone_territories', {})
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
97 for elem in sup.findall('//timezoneData/zoneFormatting/zoneItem'):
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
98 tzid = elem.attrib['type']
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
99 territory_zones.setdefault(elem.attrib['territory'], []).append(tzid)
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
100 zone_territories[tzid] = elem.attrib['territory']
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
101 if 'aliases' in elem.attrib:
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
102 for alias in elem.attrib['aliases'].split():
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
103 zone_aliases[alias] = tzid
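The loop above builds three lookup tables from the `zoneItem` elements. The sketch below replays it in Python 3 against a tiny inline document. The XML snippet is invented sample data shaped like what the loop reads, and the `.//` prefix is used because current `ElementTree` versions reject an absolute `//` path on an element.

```python
from xml.etree.ElementTree import fromstring

# Invented sample data, shaped like the elements the loop expects.
SAMPLE = """
<supplementalData>
  <timezoneData>
    <zoneFormatting>
      <zoneItem type="Europe/Berlin" territory="DE"/>
      <zoneItem type="Asia/Kolkata" territory="IN" aliases="Asia/Calcutta"/>
    </zoneFormatting>
  </timezoneData>
</supplementalData>
"""

root = fromstring(SAMPLE.strip())
territory_zones, zone_aliases, zone_territories = {}, {}, {}
for elem in root.findall('.//timezoneData/zoneFormatting/zoneItem'):
    tzid = elem.attrib['type']
    # territory -> list of zone ids, and the reverse zone id -> territory
    territory_zones.setdefault(elem.attrib['territory'], []).append(tzid)
    zone_territories[tzid] = elem.attrib['territory']
    # 'aliases' is optional and holds a space-separated list
    for alias in elem.attrib.get('aliases', '').split():
        zone_aliases[alias] = tzid

print(territory_zones)
print(zone_territories)
print(zone_aliases)
```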
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
104
347
c22f292731be Update to CLDR 1.5.1, which split out the metazone mappings into a separate supplemental file.
cmlenz
parents: 235
diff changeset
105 # Import Metazone mapping
c22f292731be Update to CLDR 1.5.1, which split out the metazone mappings into a separate supplemental file.
cmlenz
parents: 235
diff changeset
106 meta_zones = global_data.setdefault('meta_zones', {})
c22f292731be Update to CLDR 1.5.1, which split out the metazone mappings into a separate supplemental file.
cmlenz
parents: 235
diff changeset
107 tzsup = parse(os.path.join(srcdir, 'supplemental', 'metazoneInfo.xml'))
c22f292731be Update to CLDR 1.5.1, which split out the metazone mappings into a separate supplemental file.
cmlenz
parents: 235
diff changeset
108 for elem in tzsup.findall('//timezone'):
c22f292731be Update to CLDR 1.5.1, which split out the metazone mappings into a separate supplemental file.
cmlenz
parents: 235
diff changeset
109 for child in elem.findall('usesMetazone'):
c22f292731be Update to CLDR 1.5.1, which split out the metazone mappings into a separate supplemental file.
cmlenz
parents: 235
diff changeset
110 if 'to' not in child.attrib: # FIXME: support old mappings
c22f292731be Update to CLDR 1.5.1, which split out the metazone mappings into a separate supplemental file.
cmlenz
parents: 235
diff changeset
111 meta_zones[elem.attrib['type']] = child.attrib['mzone']
c22f292731be Update to CLDR 1.5.1, which split out the metazone mappings into a separate supplemental file.
cmlenz
parents: 235
diff changeset
112
235
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
113 outfile = open(os.path.join(destdir, 'global.dat'), 'wb')
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
114 try:
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
115 pickle.dump(global_data, outfile, 2)
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
116 finally:
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
117 outfile.close()
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
118
10
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
119 # build a territory containment mapping for inheritance
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
120 regions = {}
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
121 for elem in sup.findall('//territoryContainment/group'):
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
122 regions[elem.attrib['type']] = elem.attrib['contains'].split()
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
123
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
124 # Resolve territory containment
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
125 territory_containment = {}
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
126 region_items = regions.items()
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
127 region_items.sort()
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
128 for group, territory_list in region_items:
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
129 for territory in territory_list:
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
130 containers = territory_containment.setdefault(territory, set([]))
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
131 if group in territory_containment:
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
132 containers |= territory_containment[group]
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
133 containers.add(group)
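The containment resolution above turns a "group contains territories" mapping into a "territory is contained in these groups" mapping, including indirect containers. Here is a Python 3 sketch of the same pass over a small hand-written `regions` dict. The three groups are only an excerpt chosen for illustration.

```python
# Illustrative excerpt: world -> Europe -> Western Europe -> countries.
regions = {
    '001': ['150'],
    '150': ['155'],
    '155': ['DE', 'FR'],
}

territory_containment = {}
for group, territory_list in sorted(regions.items()):
    for territory in territory_list:
        containers = territory_containment.setdefault(territory, set())
        if group in territory_containment:
            # Inherit everything that already contains the group itself.
            containers |= territory_containment[group]
        containers.add(group)

print(sorted(territory_containment['DE']))
print(sorted(territory_containment['150']))
```

This is a single pass, so a territory only inherits indirect containers if its group was already resolved. In the sample, sorting the group codes happens to visit parents before children.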
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
134
392
34c0a25b1ed7 Preliminary support for timedelta formatting (see #126), and import/expose the locale plural rules from the CLDR.
cmlenz
parents: 387
diff changeset
135 # prepare the per-locale plural rules definitions
34c0a25b1ed7 Preliminary support for timedelta formatting (see #126), and import/expose the locale plural rules from the CLDR.
cmlenz
parents: 387
diff changeset
136 plural_rules = {}
34c0a25b1ed7 Preliminary support for timedelta formatting (see #126), and import/expose the locale plural rules from the CLDR.
cmlenz
parents: 387
diff changeset
137 prsup = parse(os.path.join(srcdir, 'supplemental', 'plurals.xml'))
34c0a25b1ed7 Preliminary support for timedelta formatting (see #126), and import/expose the locale plural rules from the CLDR.
cmlenz
parents: 387
diff changeset
138 for elem in prsup.findall('//plurals/pluralRules'):
34c0a25b1ed7 Preliminary support for timedelta formatting (see #126), and import/expose the locale plural rules from the CLDR.
cmlenz
parents: 387
diff changeset
139 rules = []
34c0a25b1ed7 Preliminary support for timedelta formatting (see #126), and import/expose the locale plural rules from the CLDR.
cmlenz
parents: 387
diff changeset
140 for rule in elem.findall('pluralRule'):
34c0a25b1ed7 Preliminary support for timedelta formatting (see #126), and import/expose the locale plural rules from the CLDR.
cmlenz
parents: 387
diff changeset
141 rules.append((rule.attrib['count'], unicode(rule.text)))
34c0a25b1ed7 Preliminary support for timedelta formatting (see #126), and import/expose the locale plural rules from the CLDR.
cmlenz
parents: 387
diff changeset
142 pr = PluralRule(rules)
34c0a25b1ed7 Preliminary support for timedelta formatting (see #126), and import/expose the locale plural rules from the CLDR.
cmlenz
parents: 387
diff changeset
143 for locale in elem.attrib['locales'].split():
34c0a25b1ed7 Preliminary support for timedelta formatting (see #126), and import/expose the locale plural rules from the CLDR.
cmlenz
parents: 387
diff changeset
144 plural_rules[locale] = pr
34c0a25b1ed7 Preliminary support for timedelta formatting (see #126), and import/expose the locale plural rules from the CLDR.
cmlenz
parents: 387
diff changeset
145
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
146 filenames = os.listdir(os.path.join(srcdir, 'main'))
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
147 filenames.remove('root.xml')
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
148 filenames.sort(lambda a,b: len(a)-len(b))
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
149 filenames.insert(0, 'root.xml')
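The four lines above put `root.xml` first and order the remaining files by name length, presumably so that shorter (parent) locale names are handled before longer (more specific) ones. A Python 3 sketch of the same ordering follows. It uses `key=len` because the comparison-function form of `list.sort` shown above is not available in Python 3, and the file names are made up.

```python
# Made-up directory listing.
filenames = ['de_DE.xml', 'root.xml', 'en.xml', 'de.xml', 'en_US_POSIX.xml']

filenames.remove('root.xml')
filenames.sort(key=len)          # stable sort: ties keep their original order
filenames.insert(0, 'root.xml')

print(filenames)
```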
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
150
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
151 for filename in filenames:
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
152 stem, ext = os.path.splitext(filename)
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
153 if ext != '.xml':
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
154 continue
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
155
387
88e3589ca8df Improve CLDR import of currency-related data to ignore unsupported features such as symbol choice patterns and pluralized display names. See #93.
cmlenz
parents: 377
diff changeset
156 print>>sys.stderr, 'Processing input file %r' % filename
28
695884591af6 * Reduce size of locale data pickles by only storing the data provided by each locale itself, and merging inherited data at runtime.
cmlenz
parents: 24
diff changeset
157 tree = parse(os.path.join(srcdir, 'main', filename))
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
158 data = {}
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
159
10
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
160 language = None
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
161 elem = tree.find('//identity/language')
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
162 if elem is not None:
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
163 language = elem.attrib['type']
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
164 print>>sys.stderr, ' Language: %r' % language
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
165
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
166 territory = None
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
167 elem = tree.find('//identity/territory')
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
168 if elem is not None:
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
169 territory = elem.attrib['type']
15
b47c34d42eda Extended and documented `LazyProxy`.
cmlenz
parents: 11
diff changeset
170 else:
b47c34d42eda Extended and documented `LazyProxy`.
cmlenz
parents: 11
diff changeset
171 territory = '001' # world
10
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
172 print>>sys.stderr, ' Territory: %r' % territory
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
173 regions = territory_containment.get(territory, [])
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
174 print>>sys.stderr, ' Regions: %r' % regions
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
175
392
34c0a25b1ed7 Preliminary support for timedelta formatting (see #126), and import/expose the locale plural rules from the CLDR.
cmlenz
parents: 387
diff changeset
176 # plural rules
34c0a25b1ed7 Preliminary support for timedelta formatting (see #126), and import/expose the locale plural rules from the CLDR.
cmlenz
parents: 387
diff changeset
177 locale_id = '_'.join(filter(None, [
34c0a25b1ed7 Preliminary support for timedelta formatting (see #126), and import/expose the locale plural rules from the CLDR.
cmlenz
parents: 387
diff changeset
178 language,
34c0a25b1ed7 Preliminary support for timedelta formatting (see #126), and import/expose the locale plural rules from the CLDR.
cmlenz
parents: 387
diff changeset
179 territory != '001' and territory or None
34c0a25b1ed7 Preliminary support for timedelta formatting (see #126), and import/expose the locale plural rules from the CLDR.
cmlenz
parents: 387
diff changeset
180 ]))
34c0a25b1ed7 Preliminary support for timedelta formatting (see #126), and import/expose the locale plural rules from the CLDR.
cmlenz
parents: 387
diff changeset
181 if locale_id in plural_rules:
34c0a25b1ed7 Preliminary support for timedelta formatting (see #126), and import/expose the locale plural rules from the CLDR.
cmlenz
parents: 387
diff changeset
182 data['plural_form'] = plural_rules[locale_id]
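The locale identifier used for the plural rule lookup joins the language and the territory with an underscore and leaves out the world territory `001`. A minimal Python 3 sketch of that rule (the function name `make_locale_id` is hypothetical):

```python
def make_locale_id(language, territory):
    """Build 'lang' or 'lang_TERRITORY', ignoring the world territory '001'."""
    parts = [language, territory if territory != '001' else None]
    # Drop missing components before joining, as filter(None, ...) does.
    return '_'.join(part for part in parts if part)


print(make_locale_id('de', 'DE'))
print(make_locale_id('en', '001'))
```

A file with neither a language nor a specific territory produces an empty identifier under this rule.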
34c0a25b1ed7 Preliminary support for timedelta formatting (see #126), and import/expose the locale plural rules from the CLDR.
cmlenz
parents: 387
diff changeset
183
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
184 # <localeDisplayNames>
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
185
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
186 territories = data.setdefault('territories', {})
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
187 for elem in tree.findall('//territories/territory'):
377
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
188 if ('draft' in elem.attrib or 'alt' in elem.attrib) \
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
189 and elem.attrib['type'] in territories:
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
190 continue
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
191 territories[elem.attrib['type']] = _text(elem)
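The `draft`/`alt` check above skips an alternative or draft entry only when a value for the same `type` has already been stored, so such an entry is still used if it is the only one. The Python 3 sketch below shows that behaviour on invented sample data. It reads `elem.text` directly instead of calling the script's `_text` helper.

```python
from xml.etree.ElementTree import fromstring

# Invented sample: a regular name, an alternative for it, and a draft-only entry.
root = fromstring(
    '<territories>'
    '<territory type="DE">Germany</territory>'
    '<territory type="DE" alt="short">Ger.</territory>'
    '<territory type="FR" draft="unconfirmed">France</territory>'
    '</territories>')

territories = {}
for elem in root.findall('territory'):
    is_secondary = 'draft' in elem.attrib or 'alt' in elem.attrib
    if is_secondary and elem.attrib['type'] in territories:
        continue  # keep the value that was stored first
    territories[elem.attrib['type']] = elem.text

print(territories)
```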
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
192
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
193 languages = data.setdefault('languages', {})
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
194 for elem in tree.findall('//languages/language'):
377
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
195 if ('draft' in elem.attrib or 'alt' in elem.attrib) \
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
196 and elem.attrib['type'] in languages:
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
197 continue
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
198 languages[elem.attrib['type']] = _text(elem)
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
199
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
200 variants = data.setdefault('variants', {})
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
201 for elem in tree.findall('//variants/variant'):
377
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
202 if ('draft' in elem.attrib or 'alt' in elem.attrib) \
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
203 and elem.attrib['type'] in variants:
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
204 continue
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
205 variants[elem.attrib['type']] = _text(elem)
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
206
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
207 scripts = data.setdefault('scripts', {})
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
208 for elem in tree.findall('//scripts/script'):
377
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
209 if ('draft' in elem.attrib or 'alt' in elem.attrib) \
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
210 and elem.attrib['type'] in scripts:
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
211 continue
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
212 scripts[elem.attrib['type']] = _text(elem)
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
213
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
214 # <dates>
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
215
10
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
216 week_data = data.setdefault('week_data', {})
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
217 supelem = sup.find('//weekData')
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
218
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
219 for elem in supelem.findall('minDays'):
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
220 territories = elem.attrib['territories'].split()
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
221 if territory in territories or any([r in territories for r in regions]):
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
222 week_data['min_days'] = int(elem.attrib['count'])
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
223
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
224 for elem in supelem.findall('firstDay'):
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
225 territories = elem.attrib['territories'].split()
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
226 if territory in territories or any([r in territories for r in regions]):
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
227 week_data['first_day'] = weekdays[elem.attrib['day']]
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
228
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
229 for elem in supelem.findall('weekendStart'):
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
230 territories = elem.attrib['territories'].split()
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
231 if territory in territories or any([r in territories for r in regions]):
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
232 week_data['weekend_start'] = weekdays[elem.attrib['day']]
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
233
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
234 for elem in supelem.findall('weekendEnd'):
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
235 territories = elem.attrib['territories'].split()
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
236 if territory in territories or any([r in territories for r in regions]):
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
237 week_data['weekend_end'] = weekdays[elem.attrib['day']]
0ca5dd65594f Pull in some supplemental data from the CLDR, for things like the first day of the week.
cmlenz
parents: 3
diff changeset
238
235
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
239 zone_formats = data.setdefault('zone_formats', {})
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
240 for elem in tree.findall('//timeZoneNames/gmtFormat'):
377
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
241 if 'draft' not in elem.attrib and 'alt' not in elem.attrib:
235
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
242 zone_formats['gmt'] = unicode(elem.text).replace('{0}', '%s')
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
243 break
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
244 for elem in tree.findall('//timeZoneNames/regionFormat'):
377
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
245 if 'draft' not in elem.attrib and 'alt' not in elem.attrib:
235
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
246 zone_formats['region'] = unicode(elem.text).replace('{0}', '%s')
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
247 break
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
248 for elem in tree.findall('//timeZoneNames/fallbackFormat'):
377
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
249 if 'draft' not in elem.attrib and 'alt' not in elem.attrib:
235
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
250 zone_formats['fallback'] = unicode(elem.text) \
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
251 .replace('{0}', '%(0)s').replace('{1}', '%(1)s')
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
252 break
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
253
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
254 time_zones = data.setdefault('time_zones', {})
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
255 for elem in tree.findall('//timeZoneNames/zone'):
30
9a00ac84004c Import basic timezone info from CLDR (see #3). Still missing a couple other pieces in the puzzle.
cmlenz
parents: 28
diff changeset
256 info = {}
9a00ac84004c Import basic timezone info from CLDR (see #3). Still missing a couple other pieces in the puzzle.
cmlenz
parents: 28
diff changeset
257 city = elem.findtext('exemplarCity')
9a00ac84004c Import basic timezone info from CLDR (see #3). Still missing a couple other pieces in the puzzle.
cmlenz
parents: 28
diff changeset
258 if city:
9a00ac84004c Import basic timezone info from CLDR (see #3). Still missing a couple other pieces in the puzzle.
cmlenz
parents: 28
diff changeset
259 info['city'] = unicode(city)
9a00ac84004c Import basic timezone info from CLDR (see #3). Still missing a couple other pieces in the puzzle.
cmlenz
parents: 28
diff changeset
260 for child in elem.findall('long/*'):
9a00ac84004c Import basic timezone info from CLDR (see #3). Still missing a couple other pieces in the puzzle.
cmlenz
parents: 28
diff changeset
261 info.setdefault('long', {})[child.tag] = unicode(child.text)
9a00ac84004c Import basic timezone info from CLDR (see #3). Still missing a couple other pieces in the puzzle.
cmlenz
parents: 28
diff changeset
262 for child in elem.findall('short/*'):
9a00ac84004c Import basic timezone info from CLDR (see #3). Still missing a couple other pieces in the puzzle.
cmlenz
parents: 28
diff changeset
263 info.setdefault('short', {})[child.tag] = unicode(child.text)
9a00ac84004c Import basic timezone info from CLDR (see #3). Still missing a couple other pieces in the puzzle.
cmlenz
parents: 28
diff changeset
264 time_zones[elem.attrib['type']] = info
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
265
235
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
266 meta_zones = data.setdefault('meta_zones', {})
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
267 for elem in tree.findall('//timeZoneNames/metazone'):
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
268 info = {}
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
269 city = elem.findtext('exemplarCity')
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
270 if city:
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
271 info['city'] = unicode(city)
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
272 for child in elem.findall('long/*'):
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
273 info.setdefault('long', {})[child.tag] = unicode(child.text)
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
274 for child in elem.findall('short/*'):
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
275 info.setdefault('short', {})[child.tag] = unicode(child.text)
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
276 info['common'] = elem.findtext('commonlyUsed') == 'true'
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
277 meta_zones[elem.attrib['type']] = info
36
2e143f1a0003 Extended time-zone support.
cmlenz
parents: 35
diff changeset
278
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
279 for calendar in tree.findall('//calendars/calendar'):
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
280 if calendar.attrib['type'] != 'gregorian':
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
281 # TODO: support other calendar types
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
282 continue
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
283
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
284 months = data.setdefault('months', {})
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
285 for ctxt in calendar.findall('months/monthContext'):
377
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
286 ctxt_type = ctxt.attrib['type']
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
287 ctxts = months.setdefault(ctxt_type, {})
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
288 for width in ctxt.findall('monthWidth'):
377
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
289 width_type = width.attrib['type']
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
290 widths = ctxts.setdefault(width_type, {})
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
291 for elem in width.getiterator():
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
292 if elem.tag == 'month':
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
293 if ('draft' in elem.attrib or 'alt' in elem.attrib) \
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
294 and int(elem.attrib['type']) in widths:
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
295 continue
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
296 widths[int(elem.attrib.get('type'))] = unicode(elem.text)
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
297 elif elem.tag == 'alias':
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
298 ctxts[width_type] = Alias(
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
299 _translate_alias(['months', ctxt_type, width_type],
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
300 elem.attrib['path'])
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
301 )
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
302
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
303 days = data.setdefault('days', {})
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
304 for ctxt in calendar.findall('days/dayContext'):
377
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
305 ctxt_type = ctxt.attrib['type']
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
306 ctxts = days.setdefault(ctxt_type, {})
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
307 for width in ctxt.findall('dayWidth'):
377
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
308 width_type = width.attrib['type']
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
309 widths = ctxts.setdefault(width_type, {})
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
310 for elem in width.getiterator():
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
311 if elem.tag == 'day':
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
312 dtype = weekdays[elem.attrib['type']]
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
313 if ('draft' in elem.attrib or 'alt' in elem.attrib) \
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
314 and dtype in widths:
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
315 continue
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
316 widths[dtype] = unicode(elem.text)
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
317 elif elem.tag == 'alias':
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
318 ctxts[width_type] = Alias(
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
319 _translate_alias(['days', ctxt_type, width_type],
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
320 elem.attrib['path'])
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
321 )
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
322
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
323 quarters = data.setdefault('quarters', {})
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
324 for ctxt in calendar.findall('quarters/quarterContext'):
377
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
325 ctxt_type = ctxt.attrib['type']
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
326 ctxts = quarters.setdefault(ctxt_type, {})
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
327 for width in ctxt.findall('quarterWidth'):
377
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
328 width_type = width.attrib['type']
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
329 widths = ctxts.setdefault(width_type, {})
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
330 for elem in width.getiterator():
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
331 if elem.tag == 'quarter':
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
332 if ('draft' in elem.attrib or 'alt' in elem.attrib) \
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
333 and int(elem.attrib['type']) in widths:
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
334 continue
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
335 widths[int(elem.attrib['type'])] = unicode(elem.text)
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
336 elif elem.tag == 'alias':
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
337 ctxts[width_type] = Alias(
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
338 _translate_alias(['quarters', ctxt_type, width_type],
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
339 elem.attrib['path'])
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
340 )
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
341
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
342 eras = data.setdefault('eras', {})
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
343 for width in calendar.findall('eras/*'):
377
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
344 width_type = NAME_MAP[width.tag]
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
345 widths = eras.setdefault(width_type, {})
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
346 for elem in width.getiterator():
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
347 if elem.tag == 'era':
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
348 if ('draft' in elem.attrib or 'alt' in elem.attrib) \
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
349 and int(elem.attrib['type']) in widths:
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
350 continue
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
351 widths[int(elem.attrib.get('type'))] = unicode(elem.text)
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
352 elif elem.tag == 'alias':
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
353 eras[width_type] = Alias(
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
354 _translate_alias(['eras', width_type],
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
355 elem.attrib['path'])
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
356 )
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
357
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
358 # AM/PM
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
359 periods = data.setdefault('periods', {})
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
360 for elem in calendar.findall('am'):
377
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
361 if ('draft' in elem.attrib or 'alt' in elem.attrib) \
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
362 and elem.tag in periods:
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
363 continue
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
364 periods[elem.tag] = unicode(elem.text)
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
365 for elem in calendar.findall('pm'):
377
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
366 if ('draft' in elem.attrib or 'alt' in elem.attrib) \
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
367 and elem.tag in periods:
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
368 continue
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
369 periods[elem.tag] = unicode(elem.text)
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
370
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
371 date_formats = data.setdefault('date_formats', {})
377
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
372 for format in calendar.findall('dateFormats'):
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
373 for elem in format.getiterator():
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
374 if elem.tag == 'dateFormatLength':
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
375 if ('draft' in elem.attrib or 'alt' in elem.attrib) and \
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
376 elem.attrib.get('type') in date_formats:
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
377 continue
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
378 try:
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
379 date_formats[elem.attrib.get('type')] = \
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
380 dates.parse_pattern(unicode(elem.findtext('dateFormat/pattern')))
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
381 except ValueError, e:
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
382 print>>sys.stderr, 'ERROR: %s' % e
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
383 elif elem.tag == 'alias':
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
384 date_formats = Alias(_translate_alias(
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
385 ['date_formats'], elem.attrib['path'])
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
386 )
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
387
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
388 time_formats = data.setdefault('time_formats', {})
377
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
389 for format in calendar.findall('timeFormats'):
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
390 for elem in format.getiterator():
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
391 if elem.tag == 'timeFormatLength':
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
392 if ('draft' in elem.attrib or 'alt' in elem.attrib) \
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
393 and elem.attrib.get('type') in time_formats:
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
394 continue
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
395 try:
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
396 time_formats[elem.attrib.get('type')] = \
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
397 dates.parse_pattern(unicode(elem.findtext('timeFormat/pattern')))
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
398 except ValueError, e:
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
399 print>>sys.stderr, 'ERROR: %s' % e
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
400 elif elem.tag == 'alias':
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
401 time_formats = Alias(_translate_alias(
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
402 ['time_formats'], elem.attrib['path'])
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
403 )
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
404
35
0505d666fa1f * Import datetime patterns from CLDR.
cmlenz
parents: 30
diff changeset
405 datetime_formats = data.setdefault('datetime_formats', {})
377
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
406 for format in calendar.findall('dateTimeFormats'):
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
407 for elem in format.getiterator():
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
408 if elem.tag == 'dateTimeFormatLength':
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
409 if ('draft' in elem.attrib or 'alt' in elem.attrib) \
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
410 and elem.attrib.get('type') in datetime_formats:
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
411 continue
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
412 try:
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
413 datetime_formats[elem.attrib.get('type')] = \
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
414 unicode(elem.findtext('dateTimeFormat/pattern'))
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
415 except ValueError, e:
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
416 print>>sys.stderr, 'ERROR: %s' % e
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
417 elif elem.tag == 'alias':
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
418 datetime_formats = Alias(_translate_alias(
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
419 ['datetime_formats'], elem.attrib['path'])
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
420 )
35
0505d666fa1f * Import datetime patterns from CLDR.
cmlenz
parents: 30
diff changeset
421
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
422 # <numbers>
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
423
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
424 number_symbols = data.setdefault('number_symbols', {})
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
425 for elem in tree.findall('//numbers/symbols/*'):
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
426 number_symbols[elem.tag] = unicode(elem.text)
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
427
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
428 decimal_formats = data.setdefault('decimal_formats', {})
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
429 for elem in tree.findall('//decimalFormats/decimalFormatLength'):
377
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
430 if ('draft' in elem.attrib or 'alt' in elem.attrib) \
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
431 and elem.attrib.get('type') in decimal_formats:
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
432 continue
28
695884591af6 * Reduce size of locale data pickles by only storing the data provided by each locale itself, and merging inherited data at runtime.
cmlenz
parents: 24
diff changeset
433 pattern = unicode(elem.findtext('decimalFormat/pattern'))
695884591af6 * Reduce size of locale data pickles by only storing the data provided by each locale itself, and merging inherited data at runtime.
cmlenz
parents: 24
diff changeset
434 decimal_formats[elem.attrib.get('type')] = numbers.parse_pattern(pattern)
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
435
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
436 scientific_formats = data.setdefault('scientific_formats', {})
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
437 for elem in tree.findall('//scientificFormats/scientificFormatLength'):
377
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
438 if ('draft' in elem.attrib or 'alt' in elem.attrib) \
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
439 and elem.attrib.get('type') in scientific_formats:
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
440 continue
127
a72de8971819 Add currency formatting.
cmlenz
parents: 67
diff changeset
441 pattern = unicode(elem.findtext('scientificFormat/pattern'))
a72de8971819 Add currency formatting.
cmlenz
parents: 67
diff changeset
442 scientific_formats[elem.attrib.get('type')] = numbers.parse_pattern(pattern)
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
443
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
444 currency_formats = data.setdefault('currency_formats', {})
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
445 for elem in tree.findall('//currencyFormats/currencyFormatLength'):
377
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
446 if ('draft' in elem.attrib or 'alt' in elem.attrib) \
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
447 and elem.attrib.get('type') in currency_formats:
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
448 continue
127
a72de8971819 Add currency formatting.
cmlenz
parents: 67
diff changeset
449 pattern = unicode(elem.findtext('currencyFormat/pattern'))
a72de8971819 Add currency formatting.
cmlenz
parents: 67
diff changeset
450 currency_formats[elem.attrib.get('type')] = numbers.parse_pattern(pattern)
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
451
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
452 percent_formats = data.setdefault('percent_formats', {})
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
453 for elem in tree.findall('//percentFormats/percentFormatLength'):
377
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
454 if ('draft' in elem.attrib or 'alt' in elem.attrib) \
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
455 and elem.attrib.get('type') in percent_formats:
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
456 continue
28
695884591af6 * Reduce size of locale data pickles by only storing the data provided by each locale itself, and merging inherited data at runtime.
cmlenz
parents: 24
diff changeset
457 pattern = unicode(elem.findtext('percentFormat/pattern'))
695884591af6 * Reduce size of locale data pickles by only storing the data provided by each locale itself, and merging inherited data at runtime.
cmlenz
parents: 24
diff changeset
458 percent_formats[elem.attrib.get('type')] = numbers.parse_pattern(pattern)
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
459
28
695884591af6 * Reduce size of locale data pickles by only storing the data provided by each locale itself, and merging inherited data at runtime.
cmlenz
parents: 24
diff changeset
460 currency_names = data.setdefault('currency_names', {})
695884591af6 * Reduce size of locale data pickles by only storing the data provided by each locale itself, and merging inherited data at runtime.
cmlenz
parents: 24
diff changeset
461 currency_symbols = data.setdefault('currency_symbols', {})
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
462 for elem in tree.findall('//currencies/currency'):
387
88e3589ca8df Improve CLDR import of currency-related data to ignore unsupported features such as symbol choice patterns and pluralized display names. See #93.
cmlenz
parents: 377
diff changeset
463 code = elem.attrib['type']
88e3589ca8df Improve CLDR import of currency-related data to ignore unsupported features such as symbol choice patterns and pluralized display names. See #93.
cmlenz
parents: 377
diff changeset
464 # TODO: support plural rules for currency name selection
88e3589ca8df Improve CLDR import of currency-related data to ignore unsupported features such as symbol choice patterns and pluralized display names. See #93.
cmlenz
parents: 377
diff changeset
465 for name in elem.findall('displayName'):
88e3589ca8df Improve CLDR import of currency-related data to ignore unsupported features such as symbol choice patterns and pluralized display names. See #93.
cmlenz
parents: 377
diff changeset
466 if ('draft' in name.attrib or 'count' in name.attrib) \
88e3589ca8df Improve CLDR import of currency-related data to ignore unsupported features such as symbol choice patterns and pluralized display names. See #93.
cmlenz
parents: 377
diff changeset
467 and code in currency_names:
88e3589ca8df Improve CLDR import of currency-related data to ignore unsupported features such as symbol choice patterns and pluralized display names. See #93.
cmlenz
parents: 377
diff changeset
468 continue
88e3589ca8df Improve CLDR import of currency-related data to ignore unsupported features such as symbol choice patterns and pluralized display names. See #93.
cmlenz
parents: 377
diff changeset
469 currency_names[code] = unicode(name.text)
88e3589ca8df Improve CLDR import of currency-related data to ignore unsupported features such as symbol choice patterns and pluralized display names. See #93.
cmlenz
parents: 377
diff changeset
470 # TODO: support choice patterns for currency symbol selection
88e3589ca8df Improve CLDR import of currency-related data to ignore unsupported features such as symbol choice patterns and pluralized display names. See #93.
cmlenz
parents: 377
diff changeset
471 symbol = elem.find('symbol')
88e3589ca8df Improve CLDR import of currency-related data to ignore unsupported features such as symbol choice patterns and pluralized display names. See #93.
cmlenz
parents: 377
diff changeset
472 if symbol is not None and 'draft' not in symbol.attrib \
88e3589ca8df Improve CLDR import of currency-related data to ignore unsupported features such as symbol choice patterns and pluralized display names. See #93.
cmlenz
parents: 377
diff changeset
473 and 'choice' not in symbol.attrib:
88e3589ca8df Improve CLDR import of currency-related data to ignore unsupported features such as symbol choice patterns and pluralized display names. See #93.
cmlenz
parents: 377
diff changeset
474 currency_symbols[code] = unicode(symbol.text)
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
475
392
34c0a25b1ed7 Preliminary support for timedelta formatting (see #126), and import/expose the locale plural rules from the CLDR.
cmlenz
parents: 387
diff changeset
476 # <units>
34c0a25b1ed7 Preliminary support for timedelta formatting (see #126), and import/expose the locale plural rules from the CLDR.
cmlenz
parents: 387
diff changeset
477
34c0a25b1ed7 Preliminary support for timedelta formatting (see #126), and import/expose the locale plural rules from the CLDR.
cmlenz
parents: 387
diff changeset
478 unit_patterns = data.setdefault('unit_patterns', {})
34c0a25b1ed7 Preliminary support for timedelta formatting (see #126), and import/expose the locale plural rules from the CLDR.
cmlenz
parents: 387
diff changeset
479 for elem in tree.findall('//units/unit'):
34c0a25b1ed7 Preliminary support for timedelta formatting (see #126), and import/expose the locale plural rules from the CLDR.
cmlenz
parents: 387
diff changeset
480 unit_type = elem.attrib['type']
34c0a25b1ed7 Preliminary support for timedelta formatting (see #126), and import/expose the locale plural rules from the CLDR.
cmlenz
parents: 387
diff changeset
481 unit_pattern = unit_patterns.setdefault(unit_type, {})
34c0a25b1ed7 Preliminary support for timedelta formatting (see #126), and import/expose the locale plural rules from the CLDR.
cmlenz
parents: 387
diff changeset
482 for pattern in elem.findall('unitPattern'):
34c0a25b1ed7 Preliminary support for timedelta formatting (see #126), and import/expose the locale plural rules from the CLDR.
cmlenz
parents: 387
diff changeset
483 unit_pattern[pattern.attrib['count']] = \
34c0a25b1ed7 Preliminary support for timedelta formatting (see #126), and import/expose the locale plural rules from the CLDR.
cmlenz
parents: 387
diff changeset
484 unicode(pattern.text)
34c0a25b1ed7 Preliminary support for timedelta formatting (see #126), and import/expose the locale plural rules from the CLDR.
cmlenz
parents: 387
diff changeset
485
235
d0cd235ede46 Upgraded to CLDR 1.5 and improved timezone formatting.
cmlenz
parents: 127
diff changeset
486 outfile = open(os.path.join(destdir, 'localedata', stem + '.dat'), 'wb')
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
487 try:
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
488 pickle.dump(data, outfile, 2)
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
489 finally:
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
490 outfile.close()
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
491
377
841858d5b567 Implement support for aliases in the CLDR data. Closes #68. Also, update to CLDR 1.6, and a much improved `dump_data` script.
cmlenz
parents: 347
diff changeset
492
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
493 if __name__ == '__main__':
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
494 main()
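
The format loops above repeat one selection rule: an element marked `draft` or `alt` is skipped when a value for the same `type` has already been recorded, and is otherwise stored. Below is a minimal, self-contained sketch of that rule in Python 3 using only the standard library. The helper name `collect_patterns` and the sample XML are hypothetical and not part of `import_cldr.py`; the sketch stores the raw pattern text and does not call `numbers.parse_pattern`.

```python
import xml.etree.ElementTree as ET

# Made-up sample data shaped like a CLDR <numbers> fragment (an assumption,
# not taken from a real locale file).
SAMPLE = """
<ldml>
  <numbers>
    <decimalFormats>
      <decimalFormatLength type="short">
        <decimalFormat><pattern>#,##0.###</pattern></decimalFormat>
      </decimalFormatLength>
      <decimalFormatLength type="short" draft="unconfirmed">
        <decimalFormat><pattern>#0.#</pattern></decimalFormat>
      </decimalFormatLength>
      <decimalFormatLength type="long" alt="variant">
        <decimalFormat><pattern>#,##0</pattern></decimalFormat>
      </decimalFormatLength>
    </decimalFormats>
  </numbers>
</ldml>
"""


def collect_patterns(root, length_tag, format_tag):
    """Hypothetical helper: map each `type` attribute to its pattern text.

    Draft or alternate elements never replace a value that is already
    present, but they are accepted when nothing else has been seen yet.
    """
    patterns = {}
    for elem in root.iter(length_tag):
        key = elem.attrib.get('type')
        if ('draft' in elem.attrib or 'alt' in elem.attrib) and key in patterns:
            continue
        patterns[key] = elem.findtext(format_tag + '/pattern')
    return patterns


root = ET.fromstring(SAMPLE)
result = collect_patterns(root, 'decimalFormatLength', 'decimalFormat')
print(result)
```

In this sample the draft "short" entry is ignored because a "short" pattern was already stored, while the `alt` "long" entry is kept because no other "long" value exists.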