babel/mirror: scripts/import

annotate scripts/import_cldr.py @ 400:cdf6daa1e3cc stable-0.9.x

Ported [438] and [439] back to 0.9.x branch.

author	cmlenz
date	Fri, 18 Jul 2008 13:10:46 +0000
parents	a11564c5c1f1
children	c5bc0f6822a9 74de1a99a312

rev	line source
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	1 #!/usr/bin/env python
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	2 # -*- coding: utf-8 -*-
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	3 #
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	4 # Copyright (C) 2007 Edgewall Software
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	5 # All rights reserved.
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	6 #
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	7 # This software is licensed as described in the file COPYING, which
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	8 # you should have received as part of this distribution. The terms
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	9 # are also available at http://babel.edgewall.org/wiki/License.
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	10 #
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	11 # This software consists of voluntary contributions made by many
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	12 # individuals. For the exact contribution history, see the revision
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	13 # history and logs, available at http://babel.edgewall.org/log/.
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	14
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	15 import copy
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	16 from optparse import OptionParser
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	17 import os
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	18 import pickle
379 0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	19 import re
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	20 import sys
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	21 try:
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	22 from xml.etree.ElementTree import parse
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	23 except ImportError:
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	24 from elementtree.ElementTree import parse
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	25
65 75fe8369ed3b Add Babel soruce path to CLDR import script automatically for asmodai ;-). cmlenz parents: 34 diff changeset	26 # Make sure we're using Babel source, and not some previously installed version
75fe8369ed3b Add Babel soruce path to CLDR import script automatically for asmodai ;-). cmlenz parents: 34 diff changeset	27 sys.path.insert(0, os.path.join(os.path.dirname(sys.argv[0]), '..'))
75fe8369ed3b Add Babel soruce path to CLDR import script automatically for asmodai ;-). cmlenz parents: 34 diff changeset	28
9 9ed6cf5975a1 Add basic support for number format patterns. jonas parents: 8 diff changeset	29 from babel import dates, numbers
379 0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	30 from babel.localedata import Alias
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	31
15 244a74232f5e Minor date formatting improvements. cmlenz parents: 13 diff changeset	32 weekdays = {'mon': 0, 'tue': 1, 'wed': 2, 'thu': 3, 'fri': 4, 'sat': 5,
244a74232f5e Minor date formatting improvements. cmlenz parents: 13 diff changeset	33 'sun': 6}
8 29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	34
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	35 try:
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	36 any
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	37 except NameError:
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	38 def any(iterable):
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	39 return filter(None, list(iterable))
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	40
379 0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	41
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	42 def _text(elem):
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	43 buf = [elem.text or '']
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	44 for child in elem:
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	45 buf.append(_text(child))
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	46 buf.append(elem.tail or '')
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	47 return u''.join(filter(None, buf)).strip()
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	48
379 0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	49
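The `_text` helper above flattens an element and its descendants into one string. A minimal standalone sketch of the same recursion, written for Python 3 (the name `text_of` and the sample XML are assumptions for illustration, not part of the script):

```python
from xml.etree.ElementTree import fromstring


def text_of(elem):
    # Same recursion as _text: own text, then each child's flattened
    # text, then the tail that follows this element in its parent.
    buf = [elem.text or '']
    for child in elem:
        buf.append(text_of(child))
    buf.append(elem.tail or '')
    return ''.join(filter(None, buf)).strip()


# Hypothetical input with an inline child element.
elem = fromstring('<a>Hello <b>big</b> world</a>')
flattened = text_of(elem)
print(flattened)
```

Because each recursive call strips its own result, whitespace at the very start or end of a child's contribution is dropped before it is joined into the parent's text.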
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	50 NAME_RE = re.compile(r"^\w+$")
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	51 TYPE_ATTR_RE = re.compile(r"^\w+\[@type='(.*?)'\]$")
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	52
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	53 NAME_MAP = {
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	54 'dateFormats': 'date_formats',
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	55 'dateTimeFormats': 'datetime_formats',
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	56 'eraAbbr': 'abbreviated',
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	57 'eraNames': 'wide',
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	58 'eraNarrow': 'narrow',
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	59 'timeFormats': 'time_formats'
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	60 }
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	61
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	62 def _translate_alias(ctxt, path):
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	63 parts = path.split('/')
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	64 keys = ctxt[:]
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	65 for part in parts:
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	66 if part == '..':
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	67 keys.pop()
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	68 else:
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	69 match = TYPE_ATTR_RE.match(part)
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	70 if match:
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	71 keys.append(match.group(1))
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	72 else:
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	73 assert NAME_RE.match(part)
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	74 keys.append(NAME_MAP.get(part, part))
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	75 return keys
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	76
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	77
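To illustrate what `_translate_alias` does with a relative alias path, here is a self-contained Python 3 sketch that copies its logic. The context list and the path are hypothetical values chosen for the example; real CLDR alias paths may look different.

```python
import re

NAME_RE = re.compile(r"^\w+$")
TYPE_ATTR_RE = re.compile(r"^\w+\[@type='(.*?)'\]$")

NAME_MAP = {
    'dateFormats': 'date_formats',
    'dateTimeFormats': 'datetime_formats',
    'eraAbbr': 'abbreviated',
    'eraNames': 'wide',
    'eraNarrow': 'narrow',
    'timeFormats': 'time_formats',
}


def translate_alias(ctxt, path):
    # Start from a copy of the current key context and walk the path.
    keys = list(ctxt)
    for part in path.split('/'):
        if part == '..':
            keys.pop()                       # step up one level
        else:
            match = TYPE_ATTR_RE.match(part)
            if match:
                keys.append(match.group(1))  # keep only the @type value
            else:
                assert NAME_RE.match(part)
                keys.append(NAME_MAP.get(part, part))
    return keys


# Hypothetical context and alias path, for illustration only.
ctxt = ['months', 'format', 'narrow']
path = "../../monthContext[@type='stand-alone']/monthWidth[@type='narrow']"
month_keys = translate_alias(ctxt, path)
era_keys = translate_alias(['eras', 'narrow'], '../eraAbbr')
print(month_keys, era_keys)
```

Each `..` pops one key, a step with a `[@type='...']` predicate contributes only the attribute value, and a bare element name is passed through `NAME_MAP`.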
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	78 def main():
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	79 parser = OptionParser(usage='%prog path/to/cldr')
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	80 options, args = parser.parse_args()
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	81 if len(args) != 1:
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	82 parser.error('incorrect number of arguments')
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	83
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	84 srcdir = args[0]
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	85 destdir = os.path.join(os.path.dirname(os.path.abspath(sys.argv[0])),
233 da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	86 '..', 'babel')
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	87
8 29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	88 sup = parse(os.path.join(srcdir, 'supplemental', 'supplementalData.xml'))
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	89
346 5e58ea360a5c Merged revisions [358:360], [364:370], [373:378], [380:382] from [source:trunk]. cmlenz parents: 233 diff changeset	90 # Import global data from the supplemental files
233 da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	91 global_data = {}
da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	92
da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	93 territory_zones = global_data.setdefault('territory_zones', {})
da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	94 zone_aliases = global_data.setdefault('zone_aliases', {})
da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	95 zone_territories = global_data.setdefault('zone_territories', {})
da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	96 for elem in sup.findall('//timezoneData/zoneFormatting/zoneItem'):
da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	97 tzid = elem.attrib['type']
da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	98 territory_zones.setdefault(elem.attrib['territory'], []).append(tzid)
da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	99 zone_territories[tzid] = elem.attrib['territory']
da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	100 if 'aliases' in elem.attrib:
da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	101 for alias in elem.attrib['aliases'].split():
da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	102 zone_aliases[alias] = tzid
da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	103
346 5e58ea360a5c Merged revisions [358:360], [364:370], [373:378], [380:382] from [source:trunk]. cmlenz parents: 233 diff changeset	104 # Import Metazone mapping
5e58ea360a5c Merged revisions [358:360], [364:370], [373:378], [380:382] from [source:trunk]. cmlenz parents: 233 diff changeset	105 meta_zones = global_data.setdefault('meta_zones', {})
5e58ea360a5c Merged revisions [358:360], [364:370], [373:378], [380:382] from [source:trunk]. cmlenz parents: 233 diff changeset	106 tzsup = parse(os.path.join(srcdir, 'supplemental', 'metazoneInfo.xml'))
5e58ea360a5c Merged revisions [358:360], [364:370], [373:378], [380:382] from [source:trunk]. cmlenz parents: 233 diff changeset	107 for elem in tzsup.findall('//timezone'):
5e58ea360a5c Merged revisions [358:360], [364:370], [373:378], [380:382] from [source:trunk]. cmlenz parents: 233 diff changeset	108 for child in elem.findall('usesMetazone'):
5e58ea360a5c Merged revisions [358:360], [364:370], [373:378], [380:382] from [source:trunk]. cmlenz parents: 233 diff changeset	109 if 'to' not in child.attrib: # FIXME: support old mappings
5e58ea360a5c Merged revisions [358:360], [364:370], [373:378], [380:382] from [source:trunk]. cmlenz parents: 233 diff changeset	110 meta_zones[elem.attrib['type']] = child.attrib['mzone']
5e58ea360a5c Merged revisions [358:360], [364:370], [373:378], [380:382] from [source:trunk]. cmlenz parents: 233 diff changeset	111
233 da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	112 outfile = open(os.path.join(destdir, 'global.dat'), 'wb')
da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	113 try:
da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	114 pickle.dump(global_data, outfile, 2)
da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	115 finally:
da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	116 outfile.close()
da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	117
8 29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	118 # build a territory containment mapping for inheritance
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	119 regions = {}
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	120 for elem in sup.findall('//territoryContainment/group'):
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	121 regions[elem.attrib['type']] = elem.attrib['contains'].split()
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	122
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	123 # Resolve territory containment
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	124 territory_containment = {}
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	125 region_items = regions.items()
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	126 region_items.sort()
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	127 for group, territory_list in region_items:
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	128 for territory in territory_list:
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	129 containers = territory_containment.setdefault(territory, set([]))
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	130 if group in territory_containment:
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	131 containers |= territory_containment[group]
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	132 containers.add(group)
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	133
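The containment loop above turns a "group contains territories" mapping into a "territory is contained in groups" mapping, including indirect containers. A small Python 3 sketch with made-up sample data (the `regions` dictionary here is an assumption, not actual CLDR content):

```python
# Hypothetical excerpt of a group -> contained territories mapping.
regions = {
    '001': ['019', '150'],   # world
    '019': ['021'],
    '021': ['US', 'CA'],
    '150': ['155'],
    '155': ['DE'],
}

territory_containment = {}
for group, territory_list in sorted(regions.items()):
    for territory in territory_list:
        containers = territory_containment.setdefault(territory, set())
        # Inherit everything that already contains the group itself.
        if group in territory_containment:
            containers |= territory_containment[group]
        containers.add(group)

print(sorted(territory_containment['US']))
```

This is a single pass, so a group only passes on the containers that were already recorded for it when it is visited. In the sample data every parent group sorts before its children, which is what makes the result complete here.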
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	134 filenames = os.listdir(os.path.join(srcdir, 'main'))
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	135 filenames.remove('root.xml')
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	136 filenames.sort(lambda a,b: len(a)-len(b))
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	137 filenames.insert(0, 'root.xml')
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	138
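The four lines above order the input files so that `root.xml` comes first and shorter file names come before longer ones. The comparison-function form of `list.sort` is Python 2 only; a rough Python 3 equivalent, using a hypothetical directory listing, might look like this:

```python
# Hypothetical directory listing; the real one comes from os.listdir().
filenames = ['en_US.xml', 'root.xml', 'en.xml', 'de.xml', 'en_US_POSIX.xml']

filenames.remove('root.xml')
filenames.sort(key=len)          # Python 3 form of sort(lambda a, b: len(a) - len(b))
filenames.insert(0, 'root.xml')  # root is always processed first

print(filenames)
```

Sorting by length means a name such as `en.xml` is handled before `en_US.xml`, presumably so that parent locales are processed before the locales derived from them. The sort is stable, so names of equal length keep their original relative order.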
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	139 for filename in filenames:
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	140 stem, ext = os.path.splitext(filename)
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	141 if ext != '.xml':
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	142 continue
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	143
389 a11564c5c1f1 Ported [424], [425], and [428] back to 0.9.x branch. cmlenz parents: 379 diff changeset	144 print>>sys.stderr, 'Processing input file %r' % filename
26 6041782ea677 * Reduce size of locale data pickles by only storing the data provided by each locale itself, and merging inherited data at runtime. cmlenz parents: 22 diff changeset	145 tree = parse(os.path.join(srcdir, 'main', filename))
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	146 data = {}
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	147
8 29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	148 language = None
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	149 elem = tree.find('//identity/language')
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	150 if elem is not None:
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	151 language = elem.attrib['type']
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	152 print>>sys.stderr, ' Language: %r' % language
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	153
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	154 territory = None
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	155 elem = tree.find('//identity/territory')
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	156 if elem is not None:
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	157 territory = elem.attrib['type']
13 368650dc3423 Extended and documented `LazyProxy`. cmlenz parents: 9 diff changeset	158 else:
368650dc3423 Extended and documented `LazyProxy`. cmlenz parents: 9 diff changeset	159 territory = '001' # world
8 29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	160 print>>sys.stderr, ' Territory: %r' % territory
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	161 regions = territory_containment.get(territory, [])
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	162 print>>sys.stderr, ' Regions: %r' % regions
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	163
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	164 # <localeDisplayNames>
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	165
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	166 territories = data.setdefault('territories', {})
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	167 for elem in tree.findall('//territories/territory'):
379 0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	168 if ('draft' in elem.attrib or 'alt' in elem.attrib) \
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	169 and elem.attrib['type'] in territories:
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	170 continue
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	171 territories[elem.attrib['type']] = _text(elem)
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	172
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	173 languages = data.setdefault('languages', {})
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	174 for elem in tree.findall('//languages/language'):
379 0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	175 if ('draft' in elem.attrib or 'alt' in elem.attrib) \
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	176 and elem.attrib['type'] in languages:
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	177 continue
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	178 languages[elem.attrib['type']] = _text(elem)
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	179
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	180 variants = data.setdefault('variants', {})
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	181 for elem in tree.findall('//variants/variant'):
379 0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	182 if ('draft' in elem.attrib or 'alt' in elem.attrib) \
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	183 and elem.attrib['type'] in variants:
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	184 continue
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	185 variants[elem.attrib['type']] = _text(elem)
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	186
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	187 scripts = data.setdefault('scripts', {})
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	188 for elem in tree.findall('//scripts/script'):
379 0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	189 if ('draft' in elem.attrib or 'alt' in elem.attrib) \
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	190 and elem.attrib['type'] in scripts:
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	191 continue
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	192 scripts[elem.attrib['type']] = _text(elem)
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	193
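The four display-name loops above share one rule: an element carrying a `draft` or `alt` attribute is skipped only if a value for the same `type` has already been stored. A minimal Python 3 sketch of that rule on invented sample XML (element contents are assumptions for illustration):

```python
from xml.etree.ElementTree import fromstring

# Hypothetical fragment; real locale files are much larger.
root = fromstring("""
<territories>
  <territory type="DE">Germany</territory>
  <territory type="DE" draft="unconfirmed">Germany (draft)</territory>
  <territory type="FR" alt="variant">France (alt)</territory>
</territories>
""")

territories = {}
for elem in root.findall('territory'):
    if ('draft' in elem.attrib or 'alt' in elem.attrib) \
            and elem.attrib['type'] in territories:
        continue  # never let a draft/alt value replace a stored one
    territories[elem.attrib['type']] = elem.text

print(territories)
```

A draft or alternate value is therefore still used when nothing else is available for that `type`, as with `FR` in the sample.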
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	194 # <dates>
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	195
8 29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	196 week_data = data.setdefault('week_data', {})
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	197 supelem = sup.find('//weekData')
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	198
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	199 for elem in supelem.findall('minDays'):
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	200 territories = elem.attrib['territories'].split()
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	201 if territory in territories or any([r in territories for r in regions]):
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	202 week_data['min_days'] = int(elem.attrib['count'])
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	203
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	204 for elem in supelem.findall('firstDay'):
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	205 territories = elem.attrib['territories'].split()
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	206 if territory in territories or any([r in territories for r in regions]):
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	207 week_data['first_day'] = weekdays[elem.attrib['day']]
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	208
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	209 for elem in supelem.findall('weekendStart'):
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	210 territories = elem.attrib['territories'].split()
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	211 if territory in territories or any([r in territories for r in regions]):
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	212 week_data['weekend_start'] = weekdays[elem.attrib['day']]
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	213
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	214 for elem in supelem.findall('weekendEnd'):
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	215 territories = elem.attrib['territories'].split()
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	216 if territory in territories or any([r in territories for r in regions]):
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	217 week_data['weekend_end'] = weekdays[elem.attrib['day']]
29f6f9a90f14 Pull in some supplemental data from the CLDR, for things like the first day of the week. cmlenz parents: 1 diff changeset	218
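Each of the week-data loops above matches a locale's territory, or any region containing it, against a space-separated `territories` attribute. A standalone Python 3 sketch of the `firstDay` case, using an invented supplemental fragment (the XML values and the helper name `first_day` are assumptions):

```python
from xml.etree.ElementTree import fromstring

weekdays = {'mon': 0, 'tue': 1, 'wed': 2, 'thu': 3, 'fri': 4, 'sat': 5,
            'sun': 6}

# Hypothetical fragment in the shape the loops expect.
week_elem = fromstring("""
<weekData>
  <firstDay day="mon" territories="001"/>
  <firstDay day="sun" territories="US CA"/>
</weekData>
""")


def first_day(territory, regions):
    result = None
    for elem in week_elem.findall('firstDay'):
        territories = elem.attrib['territories'].split()
        if territory in territories or any(r in territories for r in regions):
            # A later matching element overwrites an earlier one.
            result = weekdays[elem.attrib['day']]
    return result


us_first = first_day('US', {'001', '019', '021'})
de_first = first_day('DE', {'001', '150', '155'})
print(us_first, de_first)
```

Since every matching element assigns the value again, the last match in document order wins. In the sample, the world-wide default is listed first and the more specific entry overrides it.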
233 da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	219 zone_formats = data.setdefault('zone_formats', {})
da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	220 for elem in tree.findall('//timeZoneNames/gmtFormat'):
379 0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	221 if 'draft' not in elem.attrib and 'alt' not in elem.attrib:
233 da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	222 zone_formats['gmt'] = unicode(elem.text).replace('{0}', '%s')
da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	223 break
da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	224 for elem in tree.findall('//timeZoneNames/regionFormat'):
379 0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	225 if 'draft' not in elem.attrib and 'alt' not in elem.attrib:
233 da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	226 zone_formats['region'] = unicode(elem.text).replace('{0}', '%s')
da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	227 break
da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	228 for elem in tree.findall('//timeZoneNames/fallbackFormat'):
379 0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	229 if 'draft' not in elem.attrib and 'alt' not in elem.attrib:
233 da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	230 zone_formats['fallback'] = unicode(elem.text) \
da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	231 .replace('{0}', '%(0)s').replace('{1}', '%(1)s')
da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	232 break
da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	233
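The zone-format code above rewrites CLDR-style `{0}` and `{1}` placeholders into Python `%`-format placeholders. A short Python 3 sketch of how the converted patterns can then be applied (the pattern strings and the substituted values are assumptions, not taken from a locale file):

```python
# Hypothetical CLDR-style patterns.
gmt_pattern = 'GMT{0}'
fallback_pattern = '{1} ({0})'

# Single placeholder: positional %s.
gmt_format = gmt_pattern.replace('{0}', '%s')

# Two placeholders: named keys, so the pattern may use them in any order.
fallback_format = fallback_pattern \
    .replace('{0}', '%(0)s').replace('{1}', '%(1)s')

gmt_text = gmt_format % '+01:00'
fallback_text = fallback_format % {'0': 'Berlin', '1': 'Germany'}
print(gmt_text, fallback_text)
```

Using mapping keys `'0'` and `'1'` for the fallback pattern keeps the substitution independent of the order in which a locale places the two placeholders.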
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	234 time_zones = data.setdefault('time_zones', {})
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	235 for elem in tree.findall('//timeZoneNames/zone'):
28 b00b06e5ace8 Import basic timezone info from CLDR (see #3). Still missing a couple other pieces in the puzzle. cmlenz parents: 26 diff changeset	236 info = {}
b00b06e5ace8 Import basic timezone info from CLDR (see #3). Still missing a couple other pieces in the puzzle. cmlenz parents: 26 diff changeset	237 city = elem.findtext('exemplarCity')
b00b06e5ace8 Import basic timezone info from CLDR (see #3). Still missing a couple other pieces in the puzzle. cmlenz parents: 26 diff changeset	238 if city:
b00b06e5ace8 Import basic timezone info from CLDR (see #3). Still missing a couple other pieces in the puzzle. cmlenz parents: 26 diff changeset	239 info['city'] = unicode(city)
b00b06e5ace8 Import basic timezone info from CLDR (see #3). Still missing a couple other pieces in the puzzle. cmlenz parents: 26 diff changeset	240 for child in elem.findall('long/*'):
b00b06e5ace8 Import basic timezone info from CLDR (see #3). Still missing a couple other pieces in the puzzle. cmlenz parents: 26 diff changeset	241 info.setdefault('long', {})[child.tag] = unicode(child.text)
b00b06e5ace8 Import basic timezone info from CLDR (see #3). Still missing a couple other pieces in the puzzle. cmlenz parents: 26 diff changeset	242 for child in elem.findall('short/*'):
b00b06e5ace8 Import basic timezone info from CLDR (see #3). Still missing a couple other pieces in the puzzle. cmlenz parents: 26 diff changeset	243 info.setdefault('short', {})[child.tag] = unicode(child.text)
b00b06e5ace8 Import basic timezone info from CLDR (see #3). Still missing a couple other pieces in the puzzle. cmlenz parents: 26 diff changeset	244 time_zones[elem.attrib['type']] = info
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	245
233 da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	246 meta_zones = data.setdefault('meta_zones', {})
da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	247 for elem in tree.findall('//timeZoneNames/metazone'):
da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	248 info = {}
da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	249 city = elem.findtext('exemplarCity')
da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	250 if city:
da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	251 info['city'] = unicode(city)
da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	252 for child in elem.findall('long/*'):
da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	253 info.setdefault('long', {})[child.tag] = unicode(child.text)
da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	254 for child in elem.findall('short/*'):
da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	255 info.setdefault('short', {})[child.tag] = unicode(child.text)
da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	256 info['common'] = elem.findtext('commonlyUsed') == 'true'
da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	257 meta_zones[elem.attrib['type']] = info
34 464fbcefedde Extended time-zone support. cmlenz parents: 33 diff changeset	258
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	259 for calendar in tree.findall('//calendars/calendar'):
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	260 if calendar.attrib['type'] != 'gregorian':
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	261 # TODO: support other calendar types
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	262 continue
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	263
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	264 months = data.setdefault('months', {})
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	265 for ctxt in calendar.findall('months/monthContext'):
379 0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	266 ctxt_type = ctxt.attrib['type']
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	267 ctxts = months.setdefault(ctxt_type, {})
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	268 for width in ctxt.findall('monthWidth'):
379 0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	269 width_type = width.attrib['type']
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	270 widths = ctxts.setdefault(width_type, {})
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	271 for elem in width.getiterator():
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	272 if elem.tag == 'month':
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	273 if ('draft' in elem.attrib or 'alt' in elem.attrib) \
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	274 and int(elem.attrib['type']) in widths:
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	275 continue
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	276 widths[int(elem.attrib.get('type'))] = unicode(elem.text)
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	277 elif elem.tag == 'alias':
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	278 ctxts[width_type] = Alias(
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	279 _translate_alias(['months', ctxt_type, width_type],
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	280 elem.attrib['path'])
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	281 )
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	282
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	283 days = data.setdefault('days', {})
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	284 for ctxt in calendar.findall('days/dayContext'):
379 0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	285 ctxt_type = ctxt.attrib['type']
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	286 ctxts = days.setdefault(ctxt_type, {})
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	287 for width in ctxt.findall('dayWidth'):
379 0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	288 width_type = width.attrib['type']
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	289 widths = ctxts.setdefault(width_type, {})
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	290 for elem in width.getiterator():
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	291 if elem.tag == 'day':
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	292 dtype = weekdays[elem.attrib['type']]
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	293 if ('draft' in elem.attrib or 'alt' in elem.attrib) \
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	294 and dtype in widths:
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	295 continue
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	296 widths[dtype] = unicode(elem.text)
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	297 elif elem.tag == 'alias':
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	298 ctxts[width_type] = Alias(
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	299 _translate_alias(['days', ctxt_type, width_type],
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	300 elem.attrib['path'])
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	301 )
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	302
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	303 quarters = data.setdefault('quarters', {})
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	304 for ctxt in calendar.findall('quarters/quarterContext'):
379 0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	305 ctxt_type = ctxt.attrib['type']
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	306 ctxts = quarters.setdefault(ctxt.attrib['type'], {})
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	307 for width in ctxt.findall('quarterWidth'):
379 0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	308 width_type = width.attrib['type']
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	309 widths = ctxts.setdefault(width_type, {})
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	310 for elem in width.getiterator():
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	311 if elem.tag == 'quarter':
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	312 if ('draft' in elem.attrib or 'alt' in elem.attrib) \
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	313 and int(elem.attrib['type']) in widths:
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	314 continue
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	315 widths[int(elem.attrib['type'])] = unicode(elem.text)
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	316 elif elem.tag == 'alias':
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	317 ctxts[width_type] = Alias(
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	318 _translate_alias(['quarters', ctxt_type, width_type],
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	319 elem.attrib['path'])
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	320 )
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	321
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	322 eras = data.setdefault('eras', {})
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	323 for width in calendar.findall('eras/*'):
379 0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	324 width_type = NAME_MAP[width.tag]
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	325 widths = eras.setdefault(width_type, {})
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	326 for elem in width.getiterator():
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	327 if elem.tag == 'era':
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	328 if ('draft' in elem.attrib or 'alt' in elem.attrib) \
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	329 and int(elem.attrib['type']) in widths:
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	330 continue
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	331 widths[int(elem.attrib.get('type'))] = unicode(elem.text)
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	332 elif elem.tag == 'alias':
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	333 eras[width_type] = Alias(
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	334 _translate_alias(['eras', width_type],
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	335 elem.attrib['path'])
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	336 )
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	337
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	338 # AM/PM
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	339 periods = data.setdefault('periods', {})
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	340 for elem in calendar.findall('am'):
379 0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	341 if ('draft' in elem.attrib or 'alt' in elem.attrib) \
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	342 and elem.tag in periods:
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	343 continue
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	344 periods[elem.tag] = unicode(elem.text)
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	345 for elem in calendar.findall('pm'):
379 0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	346 if ('draft' in elem.attrib or 'alt' in elem.attrib) \
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	347 and elem.tag in periods:
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	348 continue
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	349 periods[elem.tag] = unicode(elem.text)
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	350
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	351 date_formats = data.setdefault('date_formats', {})
379 0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	352 for format in calendar.findall('dateFormats'):
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	353 for elem in format.getiterator():
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	354 if elem.tag == 'dateFormatLength':
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	355 if 'draft' in elem.attrib and \
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	356 elem.attrib.get('type') in date_formats:
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	357 continue
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	358 try:
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	359 date_formats[elem.attrib.get('type')] = \
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	360 dates.parse_pattern(unicode(elem.findtext('dateFormat/pattern')))
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	361 except ValueError, e:
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	362 print>>sys.stderr, 'ERROR: %s' % e
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	363 elif elem.tag == 'alias':
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	364 date_formats = Alias(_translate_alias(
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	365 ['date_formats'], elem.attrib['path'])
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	366 )
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	367
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	368 time_formats = data.setdefault('time_formats', {})
379 0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	369 for format in calendar.findall('timeFormats'):
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	370 for elem in format.getiterator():
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	371 if elem.tag == 'timeFormatLength':
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	372 if ('draft' in elem.attrib or 'alt' in elem.attrib) \
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	373 and elem.attrib.get('type') in time_formats:
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	374 continue
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	375 try:
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	376 time_formats[elem.attrib.get('type')] = \
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	377 dates.parse_pattern(unicode(elem.findtext('timeFormat/pattern')))
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	378 except ValueError, e:
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	379 print>>sys.stderr, 'ERROR: %s' % e
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	380 elif elem.tag == 'alias':
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	381 time_formats = Alias(_translate_alias(
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	382 ['time_formats'], elem.attrib['path'])
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	383 )
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	384
33 75a64f5a176e * Import datetime patterns from CLDR. cmlenz parents: 28 diff changeset	385 datetime_formats = data.setdefault('datetime_formats', {})
379 0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	386 for format in calendar.findall('dateTimeFormats'):
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	387 for elem in format.getiterator():
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	388 if elem.tag == 'dateTimeFormatLength':
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	389 if ('draft' in elem.attrib or 'alt' in elem.attrib) \
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	390 and elem.attrib.get('type') in datetime_formats:
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	391 continue
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	392 try:
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	393 datetime_formats[elem.attrib.get('type')] = \
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	394 unicode(elem.findtext('dateTimeFormat/pattern'))
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	395 except ValueError, e:
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	396 print>>sys.stderr, 'ERROR: %s' % e
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	397 elif elem.tag == 'alias':
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	398 datetime_formats = Alias(_translate_alias(
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	399 ['datetime_formats'], elem.attrib['path'])
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	400 )
33 75a64f5a176e * Import datetime patterns from CLDR. cmlenz parents: 28 diff changeset	401
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	402 # <numbers>
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	403
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	404 number_symbols = data.setdefault('number_symbols', {})
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	405 for elem in tree.findall('//numbers/symbols/*'):
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	406 number_symbols[elem.tag] = unicode(elem.text)
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	407
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	408 decimal_formats = data.setdefault('decimal_formats', {})
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	409 for elem in tree.findall('//decimalFormats/decimalFormatLength'):
379 0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	410 if ('draft' in elem.attrib or 'alt' in elem.attrib) \
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	411 and elem.attrib.get('type') in decimal_formats:
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	412 continue
26 6041782ea677 * Reduce size of locale data pickles by only storing the data provided by each locale itself, and merging inherited data at runtime. cmlenz parents: 22 diff changeset	413 pattern = unicode(elem.findtext('decimalFormat/pattern'))
6041782ea677 * Reduce size of locale data pickles by only storing the data provided by each locale itself, and merging inherited data at runtime. cmlenz parents: 22 diff changeset	414 decimal_formats[elem.attrib.get('type')] = numbers.parse_pattern(pattern)
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	415
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	416 scientific_formats = data.setdefault('scientific_formats', {})
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	417 for elem in tree.findall('//scientificFormats/scientificFormatLength'):
379 0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	418 if ('draft' in elem.attrib or 'alt' in elem.attrib) \
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	419 and elem.attrib.get('type') in scientific_formats:
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	420 continue
125 061ea0e0ac8c Add currency formatting. cmlenz parents: 65 diff changeset	421 pattern = unicode(elem.findtext('scientificFormat/pattern'))
061ea0e0ac8c Add currency formatting. cmlenz parents: 65 diff changeset	422 scientific_formats[elem.attrib.get('type')] = numbers.parse_pattern(pattern)
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	423
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	424 currency_formats = data.setdefault('currency_formats', {})
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	425 for elem in tree.findall('//currencyFormats/currencyFormatLength'):
379 0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	426 if ('draft' in elem.attrib or 'alt' in elem.attrib) \
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	427 and elem.attrib.get('type') in currency_formats:
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	428 continue
125 061ea0e0ac8c Add currency formatting. cmlenz parents: 65 diff changeset	429 pattern = unicode(elem.findtext('currencyFormat/pattern'))
061ea0e0ac8c Add currency formatting. cmlenz parents: 65 diff changeset	430 currency_formats[elem.attrib.get('type')] = numbers.parse_pattern(pattern)
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	431
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	432 percent_formats = data.setdefault('percent_formats', {})
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	433 for elem in tree.findall('//percentFormats/percentFormatLength'):
379 0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	434 if ('draft' in elem.attrib or 'alt' in elem.attrib) \
0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	435 and elem.attrib.get('type') in percent_formats:
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	436 continue
26 6041782ea677 * Reduce size of locale data pickles by only storing the data provided by each locale itself, and merging inherited data at runtime. cmlenz parents: 22 diff changeset	437 pattern = unicode(elem.findtext('percentFormat/pattern'))
6041782ea677 * Reduce size of locale data pickles by only storing the data provided by each locale itself, and merging inherited data at runtime. cmlenz parents: 22 diff changeset	438 percent_formats[elem.attrib.get('type')] = numbers.parse_pattern(pattern)
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	439
26 6041782ea677 * Reduce size of locale data pickles by only storing the data provided by each locale itself, and merging inherited data at runtime. cmlenz parents: 22 diff changeset	440 currency_names = data.setdefault('currency_names', {})
6041782ea677 * Reduce size of locale data pickles by only storing the data provided by each locale itself, and merging inherited data at runtime. cmlenz parents: 22 diff changeset	441 currency_symbols = data.setdefault('currency_symbols', {})
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	442 for elem in tree.findall('//currencies/currency'):
389 a11564c5c1f1 Ported [424], [425], and [428] back to 0.9.x branch. cmlenz parents: 379 diff changeset	443 code = elem.attrib['type']
a11564c5c1f1 Ported [424], [425], and [428] back to 0.9.x branch. cmlenz parents: 379 diff changeset	444 # TODO: support plural rules for currency name selection
a11564c5c1f1 Ported [424], [425], and [428] back to 0.9.x branch. cmlenz parents: 379 diff changeset	445 for name in elem.findall('displayName'):
a11564c5c1f1 Ported [424], [425], and [428] back to 0.9.x branch. cmlenz parents: 379 diff changeset	446 if ('draft' in name.attrib or 'count' in name.attrib) \
a11564c5c1f1 Ported [424], [425], and [428] back to 0.9.x branch. cmlenz parents: 379 diff changeset	447 and code in currency_names:
a11564c5c1f1 Ported [424], [425], and [428] back to 0.9.x branch. cmlenz parents: 379 diff changeset	448 continue
a11564c5c1f1 Ported [424], [425], and [428] back to 0.9.x branch. cmlenz parents: 379 diff changeset	449 currency_names[code] = unicode(name.text)
a11564c5c1f1 Ported [424], [425], and [428] back to 0.9.x branch. cmlenz parents: 379 diff changeset	450 # TODO: support choice patterns for currency symbol selection
a11564c5c1f1 Ported [424], [425], and [428] back to 0.9.x branch. cmlenz parents: 379 diff changeset	451 symbol = elem.find('symbol')
a11564c5c1f1 Ported [424], [425], and [428] back to 0.9.x branch. cmlenz parents: 379 diff changeset	452 if symbol is not None and 'draft' not in symbol.attrib \
a11564c5c1f1 Ported [424], [425], and [428] back to 0.9.x branch. cmlenz parents: 379 diff changeset	453 and 'choice' not in symbol.attrib:
a11564c5c1f1 Ported [424], [425], and [428] back to 0.9.x branch. cmlenz parents: 379 diff changeset	454 currency_symbols[code] = unicode(symbol.text)
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	455
233 da97a3138239 Upgraded to CLDR 1.5 and improved timezone formatting. cmlenz parents: 125 diff changeset	456 outfile = open(os.path.join(destdir, 'localedata', stem + '.dat'), 'wb')
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	457 try:
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	458 pickle.dump(data, outfile, 2)
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	459 finally:
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	460 outfile.close()
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	461
379 0a0bc1639ea7 Ported [407:415/trunk] back to 0.9.x branch. cmlenz parents: 346 diff changeset	462
1 7870274479f5 Import of initial code base. cmlenz parents: diff changeset	463 if __name__ == '__main__':
7870274479f5 Import of initial code base. cmlenz parents: diff changeset	464 main()
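
The month, day, quarter, era, and format loops above all repeat the same rule: an element marked `draft` or `alt` never overwrites a value that has already been stored for the same key. Below is a minimal sketch of that rule on its own. It is not part of the script: the helper names `should_skip` and `collect` and the sample XML are hypothetical, and it uses the Python 3 standard library rather than the Python 2 idioms of the listing.

```python
import xml.etree.ElementTree as ET


def should_skip(elem, key, dest):
    """Return True if elem is a draft/alt variant and dest already has key."""
    is_variant = 'draft' in elem.attrib or 'alt' in elem.attrib
    return is_variant and key in dest


def collect(parent, tag):
    """Collect element texts keyed by integer type, keeping existing values
    when a later element for the same key is a draft/alt variant."""
    result = {}
    for elem in parent.iter(tag):
        key = int(elem.attrib['type'])
        if should_skip(elem, key, result):
            continue
        result[key] = elem.text
    return result


# Hypothetical sample data, shaped like a CLDR <monthWidth> element.
SAMPLE = ET.fromstring(
    '<monthWidth type="wide">'
    '<month type="1">January</month>'
    '<month type="1" alt="variant">Jan.</month>'
    '<month type="2" draft="unconfirmed">Febr</month>'
    '</monthWidth>'
)

months = collect(SAMPLE, 'month')
```

With this sample, the `alt` variant for month 1 is ignored because an approved value is already present, while the draft value for month 2 is kept because nothing else has been stored for that key.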
