This file contains the high-level functions to read a VOTable file.
Parses a VOTABLE XML file (or file-like object), and returns a VOTable object, with a nested list of Resource instances and Table instances.
source may be a filename or a readable file-like object.
If the columns parameter is specified, it should be a list of field names to include in the output. The default is to include all fields.
The invalid parameter may be one of the following values:
- ‘exception’: throw an exception when an invalid value is encountered (default)
- ‘mask’: mask out invalid values
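As a rough illustration of the two invalid modes listed above, the sketch below parses a single integer cell. The helper name parse_int_cell and the placeholder value 0 for masked cells are assumptions made for this example and are not part of the vo package.

```python
def parse_int_cell(text, invalid="exception"):
    """Parse one integer cell and return a (value, mask) pair.

    Hypothetical helper: only the two `invalid` modes come from the
    documentation above; everything else is illustrative.
    """
    if invalid not in ("exception", "mask"):
        raise ValueError("invalid must be 'exception' or 'mask'")
    try:
        return int(text), False
    except ValueError:
        if invalid == "mask":
            # The placeholder value 0 is an arbitrary choice for this sketch.
            return 0, True
        raise
```

With invalid="mask" a bad cell is reported through the mask flag instead of stopping the whole read.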
When pedantic is True, raise an error when the file violates the spec, otherwise issue a warning. Warnings may be controlled using the standard Python mechanisms. See the warnings module in the Python standard library for more information.
chunk_size is the number of rows to read before converting to an array. Higher numbers are likely to be faster, but will consume more memory.
table_number is the index of the table in the file to read in. If None, all tables will be read. If a number, 0 refers to the first table in the file.
filename is a filename, URL or other identifier to use in error messages. If filename is None and source is a string (i.e. a path), then source will be used as a filename for error messages.
This file defines the nodes that make up the VOTABLE XML tree.
Bases: vo.tree.SimpleElement
A class representing the COOSYS element, which defines a coordinate system.
The keyword arguments correspond to setting members of the same name, documented below.
[required] The XML ID of the COOSYS element, used for cross-referencing. May be None or a string conforming to XML ID syntax.
Specifies the epoch of the positions. It must be a string specifying an astronomical year.
A parameter required to fix the equatorial or ecliptic systems (e.g. “J2000” as the default for “eq_FK5”, or “B1950” as the default for “eq_FK4”).
Specifies the type of coordinate system. Valid choices are:
‘eq_FK4’, ‘eq_FK5’, ‘ICRS’, ‘ecl_FK4’, ‘ecl_FK5’, ‘galactic’, ‘supergalactic’, ‘xy’, ‘barycentric’, or ‘geo_app’
Bases: object
A base class for all classes that represent XML elements in the VOTABLE file.
Bases: vo.tree.SimpleElement
A class that represents the FIELD element, which describes the datatype of a particular column of data.
The keyword arguments correspond to setting members of the same name, documented below.
If ID is provided, it is used for the column name in the resulting recarray of the table. If no ID is provided, name is used instead. If neither is provided, an exception will be raised.
Make sure that all names and titles in a list of fields are unique, by appending numbers if necessary.
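One possible way to make names unique by appending numbers is sketched below. The function name uniqify_names and the exact suffix scheme (starting at 2) are assumptions; the package may number duplicates differently.

```python
def uniqify_names(names):
    """Return a new list in which repeated names get a numeric suffix."""
    seen = set()
    result = []
    for name in names:
        candidate = name
        counter = 2
        # Keep trying name2, name3, ... until the candidate is unused.
        while candidate in seen:
            candidate = f"{name}{counter}"
            counter += 1
        seen.add(candidate)
        result.append(candidate)
    return result
```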
The XML ID of the FIELD element, used for cross-referencing. May be None or a string conforming to XML ID syntax.
Specifies the size of the multidimensional array if this FIELD contains more than a single value.
[required] The datatype of the column. Valid values (as defined by the spec) are:
‘boolean’, ‘bit’, ‘unsignedByte’, ‘short’, ‘int’, ‘long’, ‘char’, ‘unicodeChar’, ‘float’, ‘double’, ‘floatComplex’, or ‘doubleComplex’
Many VOTABLE files in the wild use ‘string’ instead of ‘char’, so that is also a valid option, though ‘string’ will always be converted to ‘char’ when writing the file back out.
A list of Link instances used to reference more details about the meaning of the FIELD. This is purely informational and is not used by the vo package.
Along with width, defines the numerical accuracy associated with the data. These values are used to limit the precision when writing floating point values back to the XML file. Otherwise, it is purely informational – the Numpy recarray containing the data itself does not use this information.
On FIELD elements, ref is used only for informational purposes, for example to refer to a COOSYS element.
The unified content descriptor for the FIELD.
The usage-specific or unique type of the FIELD.
Along with precision, defines the numerical accuracy associated with the data. These values are used to limit the precision when writing floating point values back to the XML file. Otherwise, it is purely informational – the Numpy recarray containing the data itself does not use this information.
Extended data type information.
Bases: vo.tree.SimpleElement
A class representing the FIELDref element, which is used inside of GROUP elements to refer to FIELD elements defined elsewhere.
table is the Table object that this FieldRef is a member of.
ref is the ID to reference a Field object defined elsewhere.
The unified content descriptor for the FIELDref.
The usage-specific or unique type of the FIELDref.
Bases: vo.tree.Element
Stores information about the grouping of FIELD and PARAM elements.
This information is currently ignored by the vo package—that is the columns in the recarray are always flat—but the grouping information is stored so that it can be written out again to the XML file.
The keyword arguments correspond to setting members of the same name, documented below.
An optional string describing the GROUP. Corresponds to the DESCRIPTION element.
A list of members of the GROUP. This list may only contain objects of type Param, Group, ParamRef and FieldRef.
An optional name for the grouping.
Currently ignored, as it’s not clear from the spec how this is meant to work.
The unified content descriptor for the GROUP.
The usage-specific or unique type of the GROUP.
Bases: vo.tree.SimpleElementWithContent
A class for storing INFO elements, which contain arbitrary key-value pairs for extensions to the standard.
The keyword arguments correspond to setting members of the same name, documented below.
The content inside the INFO element.
[required] The key of the key-value pair.
The usage-specific or unique type of the INFO.
The value of the key-value pair. (Always stored as a string or unicode string).
Extended data type information.
Bases: vo.tree.SimpleElement
A class for storing LINK elements, which are used to reference external documents and servers through a URI.
The keyword arguments correspond to setting members of the same name, documented below.
Defines the MIME role of the referenced object. Must be one of:
None, ‘query’, ‘hints’, ‘doc’ or ‘location’
Defines the MIME content type of the referenced object.
A URI to an arbitrary protocol. The vo package only supports http and anonymous ftp.
Bases: vo.tree.Field
A class to represent the PARAM element, which is a constant-valued column in the data.
Param objects are a subclass of Field, and have all of its methods and members. Additionally, Param defines value.
Bases: vo.tree.SimpleElement
A class representing the PARAMref element, which is used inside of GROUP elements to refer to PARAM elements defined elsewhere.
The keyword arguments correspond to setting members of the same name, documented below.
It contains the following publicly-accessible members:
ref: An XML ID referring to a <PARAM> element.
The unified content descriptor for the PARAMref.
The usage-specific or unique type of the PARAMref.
Bases: vo.tree.Element
A class to store the information in a RESOURCE element. Each resource may contain zero-or-more TABLE elements and zero-or-more nested RESOURCE elements.
The keyword arguments correspond to setting members of the same name, documented below.
Recursively iterates over all the COOSYS elements in the resource and nested resources.
Recursively iterates over all FIELD and PARAM elements in the resource, its tables and nested resources.
Recursively iterates over all tables in the resource and nested resources.
The XML ID of the RESOURCE element, used for cross-referencing. May be None or a string conforming to XML ID syntax.
A list of coordinate system definitions (COOSYS elements) for the RESOURCE. Must contain only CooSys objects.
An optional string describing the RESOURCE. Corresponds to the DESCRIPTION element.
A dictionary of string keys to string values containing any extra attributes of the RESOURCE element that are not defined in the specification. (The specification explicitly allows for extra attributes here, but nowhere else.)
A list of informational parameters (key-value pairs) for the resource. Must only contain Info objects.
A list of links (pointers to other documents or servers through a URI) for the resource. Must contain only Link objects.
A list of parameters (constant-valued columns) for the resource. Must contain only Param objects.
[required] The type of the resource. Must be either:
- ‘results’: This resource contains actual result values (default)
- ‘meta’: This resource contains only datatype descriptions (FIELD elements), but no actual data.
The usage-specific or unique type of the RESOURCE.
Bases: vo.tree.Element
A base class for simple elements, such as FIELD, PARAM and INFO that don’t require any special parsing or outputting machinery.
Bases: vo.tree.SimpleElement
A base class for simple elements, such as INFO, that carry text content but don’t require any special parsing or outputting machinery.
The content of the element.
Bases: vo.tree.Element
A class to store a TABLE element, which optionally contains data.
It contains the following publicly-accessible members, all of which are mutable:
array: A Numpy recarray of the data itself, where each row is a row of votable data, and columns are named and typed based on the <FIELD> elements of the table.
mask: A Numpy recarray of only boolean values, set to True wherever a value is undefined.
If the Table contains no data (for example, its enclosing Resource has type == ‘meta’), array and mask will be zero-length arrays.
Note
In a future version of the vo package, the array and mask elements will likely be combined into a single Numpy masked record array. However, there are a number of deficiencies in the current implementation of Numpy that prevent this.
The keyword arguments correspond to setting members of the same name, documented below.
Create new arrays to hold the data based on the current set of fields, and store them in the array and mask member variables. Any data in existing arrays will be lost.
nrows, if provided, is the number of rows to allocate.
Looks up a FIELD or PARAM element by the given ID.
Looks up a FIELD or PARAM element by the given ID or name.
Looks up a GROUP element by the given ID. Used by the group’s “ref” attribute.
Returns True if this table doesn’t contain any real data because it was skipped over by the parser (through use of the table_number kwarg).
Recursively iterate over all FIELD and PARAM elements in the TABLE.
Recursively iterate over all GROUP elements in the TABLE.
The XML ID of the TABLE element, used for cross-referencing. May be None or a string conforming to XML ID syntax.
An optional string describing the TABLE. Corresponds to the DESCRIPTION element.
[required] The serialization format of the table. Must be one of:
Note that the ‘fits’ format, since it requires an external file, cannot be written out. Any file read in with ‘fits’ format will be written out, by default, in ‘tabledata’ format.
A list of Group objects describing how the columns and parameters are grouped. Currently this information is only kept around for round-tripping and informational purposes.
An optional name for the table.
[immutable] The number of rows in the table, as specified in the XML file.
A list of parameters (constant-valued columns) for the table. Must contain only Param objects.
Refer to another TABLE, previously defined, by the ref ID for all metadata (FIELD, PARAM etc.) information.
The unified content descriptor for the TABLE.
Bases: vo.tree.Element
A class to represent the top-level VOTABLE element.
The keyword arguments correspond to setting members of the same name, documented below.
version is settable at construction time only, since conformance tests for building the rest of the structure depend on it.
Looks up a COOSYS element by the given ID.
Looks up a FIELD element by the given ID. Used by the field’s “ref” attribute.
Looks up a FIELD element by the given ID or name.
Often, you know there is only one table in the file, and that’s all you need. This method returns that first table.
Looks up a GROUP element by the given ID. Used by the group’s “ref” attribute.
Looks up a TABLE element by the given ID. Used by the table “ref” attribute.
Get a table by its ordinal position in the file.
Looks up a VALUES element by the given ID. Used by the values “ref” attribute.
Recursively iterate over all COOSYS elements in the VOTABLE file.
Recursively iterate over all FIELD and PARAM elements in the VOTABLE file.
Recursively iterate over all GROUP elements in the VOTABLE file.
Iterates over all tables in the VOTable file in a “flat” way, ignoring the nesting of resources etc.
Set the output storage format of all tables in the file.
Write to an XML file.
fd may be a path to a file, or a Python file-like object.
The XML ID of the VOTABLE element, used for cross-referencing. May be None or a string conforming to XML ID syntax.
A list of coordinate system descriptions for the file. Must contain only CooSys objects.
An optional string describing the VOTABLE. Corresponds to the DESCRIPTION element.
A list of groups, in the order they appear in the file. Only supported as a child of the VOTABLE element in VOTable 1.2 or later.
A list of informational parameters (key-value pairs) for the entire file. Must only contain Info objects.
A list of parameters (constant-valued columns) that apply to the entire file. Must contain only Param objects.
A list of resources, in the order they appear in the file. Must only contain Resource objects.
The version of the VOTable specification that the file uses.
Bases: vo.tree.Element
A class to represent the VALUES element, used within FIELD and PARAM elements to define the domain of values.
The keyword arguments correspond to setting members of the same name, documented below.
The XML ID of the VALUES element, used for cross-referencing. May be None or a string conforming to XML ID syntax.
The maximum value of the domain. See max_inclusive.
When True, the domain includes the maximum value.
The minimum value of the domain. See min_inclusive.
When True, the domain includes the minimum value.
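To make the meaning of the four members above concrete, here is a small membership test. It is only a sketch of the semantics: the function in_domain and its argument names are hypothetical, and nothing here implies that the vo package enforces the domain when reading data.

```python
def in_domain(value, minimum=None, min_inclusive=True,
              maximum=None, max_inclusive=True):
    """Return True if value lies inside the described domain."""
    if minimum is not None:
        if value < minimum or (value == minimum and not min_inclusive):
            return False
    if maximum is not None:
        if value > maximum or (value == maximum and not max_inclusive):
            return False
    return True
```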
For integral datatypes, null is used to define the value used for missing values.
A list of string key-value tuples defining other OPTION elements for the domain. All options are ignored – they are stored for round-tripping purposes only.
Raises a VOTableSpecError if year is not a valid astronomical year as defined by the VOTABLE standard.
Raises a VOTableSpecError if string is not a string or Unicode string.
Warns or raises a VOTableSpecError if ucd is not a valid unified content descriptor string as defined by the VOTABLE standard.
This module handles the conversion of various VOTABLE datatypes to/from TABLEDATA and BINARY formats.
Bases: vo.converters.Converter
Handles both fixed and variable-length arrays.
Bases: vo.converters.VarArray
Handles an array of variable-length arrays, i.e. where arraysize ends in ‘*’.
Bases: vo.converters.Converter
Handles the bit datatype.
alias of ScalarVarArray
Bases: vo.converters.NumericArray
Handles an array of bits.
alias of ArrayVarArray
Bases: vo.converters.Converter
Handles the boolean datatype.
alias of BooleanArray
Bases: vo.converters.NumericArray
Handles an array of boolean values.
Bases: vo.converters.Converter
Handles the char datatype. (7-bit unsigned characters)
Missing values are not handled for string or unicode types.
Bases: vo.converters.FloatingPoint, vo.converters.Array
The base class for complex numbers.
alias of ComplexArray
alias of ComplexVarArray
Bases: vo.converters.NumericArray
Handles a fixed-size array of complex numbers.
alias of ComplexArrayVarArray
Bases: vo.converters.VarArray
Handles an array of variable-length arrays of complex numbers.
Bases: vo.converters.VarArray
Handles a variable-length array of complex numbers.
Bases: object
The base class for all converters. Each subclass handles converting a specific VOTABLE data type to/from the TABLEDATA and BINARY on-disk representations.
Convert the object value in the native in-memory datatype to a string of bytes suitable for serialization in the BINARY format.
Reads some number of bytes from the BINARY format representation by calling the function read, and returns the native in-memory object representation for the datatype handled by self. The result is returned as a tuple (value, mask).
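The two BINARY methods described above can be sketched for a signed 32-bit integer as follows. The function names are hypothetical, and the sketch assumes the big-endian byte order used by the VOTable BINARY serialization; it never produces a masked value.

```python
import io
import struct


def int_to_binary(value):
    """Pack a Python int as a big-endian signed 32-bit integer."""
    return struct.pack(">i", value)


def int_from_binary(read):
    """Read 4 bytes by calling read and return a (value, mask) tuple."""
    (value,) = struct.unpack(">i", read(4))
    return value, False


# Round trip through an in-memory stream; only its read method is passed in.
stream = io.BytesIO(int_to_binary(-2))
round_tripped = int_from_binary(stream.read)
```

Passing the bound read method rather than the stream itself mirrors the description above, where the parser is handed a function that returns the requested number of bytes.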
Convert the object value in the native in-memory datatype to a string suitable for serializing in the TABLEDATA format. If mask is True, will return the string representation of a masked value.
Convert the string value from the TABLEDATA format into an object with the correct native in-memory datatype. The result is returned as a tuple (value, mask).
Parse a single scalar of the underlying type of the converter. For non-array converters, this is equivalent to parse. For array converters, this is equivalent to parsing a single element of the array. The result is returned as a tuple (value, mask).
Bases: vo.converters.FloatingPoint
Handles the double datatype. Double-precision IEEE floating-point.
Bases: vo.converters.Complex
Handle doubleComplex datatype. Pair of double-precision IEEE floating-point numbers.
Bases: vo.converters.FloatingPoint
Handles the float datatype. Single-precision IEEE floating-point.
Bases: vo.converters.Complex
Handle floatComplex datatype. Pair of single-precision IEEE floating-point numbers.
Bases: vo.converters.Numeric
The base class for floating-point datatypes.
Bases: vo.converters.Integer
Handles the int datatype. Signed 32-bit integer.
Bases: vo.converters.Numeric
The base class for all the integral datatypes.
Bases: vo.converters.Integer
Handles the long datatype. Signed 64-bit integer.
Bases: vo.converters.Converter
The base class for all numeric data types.
alias of NumericArray
alias of ScalarVarArray
Bases: vo.converters.Array
Handles a fixed-length array of numeric scalars.
alias of ArrayVarArray
Bases: vo.converters.VarArray
Handles a variable-length array of numeric scalars.
Bases: vo.converters.Integer
Handles the short datatype. Signed 16-bit integer.
Bases: vo.converters.Converter
Handles the unicodeChar data type. UTF-16-BE.
Missing values are not handled for string or unicode types.
Bases: vo.converters.Integer
Handles the unsignedByte datatype. Unsigned 8-bit integer.
Bases: vo.converters.Array
Handles variable-length arrays (i.e. where arraysize is ‘*’).
Given a Field object field, return an appropriate converter class to handle the specified datatype.
A regex to handle splitting values on either whitespace or commas.
SPEC: Usage of commas is not actually allowed by the spec, but many files in the wild use them.
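A regular expression with this behaviour could look like the sketch below. The pattern and the helper split_values are assumptions written for illustration and are not copied from the package.

```python
import re

# A comma with optional surrounding whitespace, or a run of whitespace.
_SEPARATOR = re.compile(r"\s*,\s*|\s+")


def split_values(text):
    """Split an array cell into its individual value strings."""
    text = text.strip()
    return _SEPARATOR.split(text) if text else []
```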
Exceptions and warnings used by the vo package.
Bases: object
The base class of all VO warnings and exceptions. Handles the formatting of the message with a warning or exception code, filename, line and column number.
Warn or raise an exception, depending on the pedantic setting.
Raise an exception, with proper position information if available.
Raise an exception, with proper position information if available.
Restores the original traceback of the exception, and should only be called within an “except:” block of code.
Warn, with proper position information if available.
Parses the vo warning string back into its parts.
This file contains routines to verify the correctness of UCD strings.
A class to manage the list of acceptable UCD words. Works by reading in a data file exactly as provided by IVOA. This file resides in data/ucd1p-words.txt.
Returns the official English description of the given UCD name.
Returns True if name is a valid primary name.
Returns True if name is a valid secondary name.
Returns the standard capitalization form of the given name.
Returns False if ucd is not a valid unified content descriptor.
If check_controlled_vocabulary is True, then each word in the UCD will be verified against the UCD1+ controlled vocabulary (as required by the VOTable specification version 1.2); otherwise it will not.
Parse the UCD into its component parts. The result is a list of tuples of the form:
(namespace, word)
If no namespace was explicitly specified, namespace will be returned as 'ivoa' (i.e., the default namespace).
If check_controlled_vocabulary is True, then each word in the UCD will be verified against the UCD1+ controlled vocabulary (as required by the VOTable specification version 1.2); otherwise it will not.
Will raise ValueError if ucd is invalid.
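The shape of the result can be illustrated with a simplified parser that only splits the string and fills in the default namespace. The name parse_ucd_sketch is hypothetical, and this version performs no controlled-vocabulary or character checks.

```python
def parse_ucd_sketch(ucd):
    """Split a UCD string into a list of (namespace, word) tuples."""
    parts = []
    for item in ucd.split(";"):
        item = item.strip()
        if ":" in item:
            namespace, word = item.split(":", 1)
        else:
            # No explicit namespace: fall back to the default one.
            namespace, word = "ivoa", item
        if not namespace or not word:
            raise ValueError(f"invalid UCD: {ucd!r}")
        parts.append((namespace, word))
    return parts
```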
Various utilities and cookbook-like things.
Bases: list
A subclass of list that contains only elements of a given type or types.
types is a sequence of acceptable types.
values (optional) is an initial set of values.
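A type-restricted list of this kind might be sketched as below. The class name TypedList is an assumption, and the sketch guards only append, extend and insert; slice assignment and += are deliberately left unchecked to keep it short.

```python
class TypedList(list):
    """A list that accepts only items of the given types (sketch)."""

    def __init__(self, types, values=()):
        super().__init__()
        self._types = tuple(types)
        self.extend(values)

    def _check(self, item):
        if not isinstance(item, self._types):
            raise TypeError(f"{item!r} is not one of {self._types}")

    def append(self, item):
        self._check(item)
        super().append(item)

    def extend(self, items):
        for item in items:
            self.append(item)

    def insert(self, index, item):
        self._check(item)
        super().insert(index, item)
```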
A class to display a progress bar in the terminal.
It is designed for use with the with statement:
    with ProgressBar(len(items)) as bar:
        for i, item in enumerate(items):
            bar.update(i)
total is the number of steps in the process.
Update the progress bar to the given value (out of the total given to the constructor).
Bases: dict
A dictionary with a maximum size.
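The documentation does not say which entry is dropped when the limit is reached. The sketch below assumes the oldest inserted key is evicted and that max_size is at least 1; the class name SizeLimitedDict is likewise hypothetical.

```python
class SizeLimitedDict(dict):
    """A dict that never holds more than max_size items (sketch)."""

    def __init__(self, max_size):
        super().__init__()
        self.max_size = max_size

    def __setitem__(self, key, value):
        if key not in self and len(self) >= self.max_size:
            # Plain dicts keep insertion order (Python 3.7+), so the
            # first key yielded is the oldest one.
            oldest = next(iter(self))
            del self[oldest]
        super().__setitem__(key, value)
```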
Bases: object
Decorator that caches a function’s return value each time it is called. If called later with the same arguments, the cached value is returned, and not re-evaluated.
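A class-based caching decorator along these lines can be written in a few lines. This is a generic sketch, not the package's code; it assumes all arguments are positional and hashable.

```python
import functools


class memoized:
    """Cache a function's results, keyed by its positional arguments."""

    def __init__(self, func):
        self.func = func
        self.cache = {}
        functools.update_wrapper(self, func)

    def __call__(self, *args):
        try:
            return self.cache[args]
        except KeyError:
            value = self.cache[args] = self.func(*args)
            return value
```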
Coerces and/or verifies the object p into a valid range-list-format parameter as defined in Section 8.7.2 of Simple Spectral Access Protocol.
p may be a string as passed verbatim to the service expecting a range-list, or a sequence. If a sequence, each item must be either:
- a numeric value
- a named value, such as, for example, ‘J’ for named spectrum (if the numeric kwarg is False)
- a 2-tuple indicating a range
- the last item may be a string indicating the frame of reference
The result is a tuple:
- a string suitable for passing to a service as a range-list argument
- an integer counting the number of elements
frames, if provided, should be a sequence of acceptable frame of reference keywords.
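The coercion rules above can be sketched as follows. The function name, the use of "," between items, "/" inside a range and ";" before the frame, and the decision not to count the frame as an element are all assumptions made for this illustration.

```python
def coerce_range_list(p, frames=None, numeric=True):
    """Return (range_list_string, element_count) for p (sketch)."""
    if isinstance(p, str):
        # Strings are passed through verbatim; count the comma-separated
        # items that precede any ";frame" suffix.
        body = p.split(";", 1)[0]
        return p, len(body.split(","))

    items = list(p)
    frame = None
    if items and isinstance(items[-1], str) and frames and items[-1] in frames:
        frame = items.pop()

    parts = []
    for item in items:
        if isinstance(item, tuple):
            if len(item) != 2:
                raise ValueError("a range must be a 2-tuple")
            parts.append(f"{item[0]}/{item[1]}")
        elif isinstance(item, str):
            if numeric:
                raise ValueError(f"named value {item!r} not allowed")
            parts.append(item)
        elif isinstance(item, (int, float)):
            parts.append(str(item))
        else:
            raise ValueError(f"unsupported item {item!r}")

    text = ",".join(parts)
    if frame is not None:
        text += ";" + frame
    return text, len(parts)
```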
Prints the string s in the given color where color is an ANSI terminal color name. stream may be any writable file-like object.
TODO: Be smart about when to use and not use color.
Returns a function suitable for streaming input, or a file object.
fd may be:
- a file object, in which case it is returned verbatim.
- a function that reads from a stream, in which case it is returned verbatim.
- a file path, in which case it is opened. If it ends in gz, it is assumed to be a gzipped file, and the read() method on the file object is returned. Otherwise, the raw file object is returned.
- an object with a read() method, in which case that method is returned.
Returns a writable file-like object suitable for streaming output.
fd may be:
- a file path, in which case it is opened, and the write() method on the file object is returned.
- an object with a write() method, in which case that method is returned.
Like dict.update, except if the values in u are None, d is not updated.
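In plain Python the behaviour amounts to the following sketch; the function name update_skipping_none is hypothetical.

```python
def update_skipping_none(d, u):
    """Copy items from u into d, ignoring items whose value is None."""
    for key, value in u.items():
        if value is not None:
            d[key] = value
```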
Abstracts away the different ways to test for a callable object in Python 2.x and 3.x.
Does a map while displaying a progress bar with percentage complete.
Compare two version identifiers.
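The documentation gives no details on the comparison, so the sketch below is an assumption throughout: it treats versions as dot-separated integers and returns -1, 0 or 1. Components are compared numerically, so "1.10" sorts after "1.2".

```python
def compare_versions(a, b):
    """Return -1, 0 or 1 depending on how version a compares to b."""
    left = tuple(int(part) for part in a.split("."))
    right = tuple(int(part) for part in b.split("."))
    return (left > right) - (left < right)
```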
Various XML-related utilities
file is a writable file-like object.
Closes open elements, up to (and including) the element identified by the given identifier.
id: Element identifier, as returned by the start method.
Adds a comment to the output stream.
comment: Comment text, as a Unicode string.
Adds character data to the output stream.
text: Character data, as a Unicode string.
Adds an entire element. This is the same as calling start, data, and end in sequence. The text argument can be omitted.
Closes the current element (opened by the most recent call to start).
tag: Element tag. If given, the tag must match the start tag. If omitted, the current element is closed.
Returns the number of indentation levels the file is currently in.
Returns a string of spaces that matches the current indentation level.
Opens a new element. Attributes can be given as keyword arguments, or as a string/string dictionary. The method returns an opaque identifier that can be passed to the close() method, to close all open elements up to and including this one.
tag: Element tag.
attrib: Attribute dictionary. Alternatively, attributes can be given as keyword arguments.
Returns an element identifier.
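The relationship between start, data, end and close can be illustrated with a stripped-down writer. TinyXMLWriter is a hypothetical stand-in written for this example; it omits indentation, comments and the element shortcut, and it assumes the identifier returned by start is simply the nesting depth.

```python
import io
from xml.sax.saxutils import escape, quoteattr


class TinyXMLWriter:
    """A minimal streaming XML writer (illustrative sketch only)."""

    def __init__(self, file):
        self._write = file.write
        self._tags = []

    def start(self, tag, attrib=None, **extra):
        attributes = dict(attrib or {}, **extra)
        rendered = "".join(
            f" {key}={quoteattr(str(value))}"
            for key, value in sorted(attributes.items())
        )
        self._write(f"<{tag}{rendered}>")
        self._tags.append(tag)
        # The identifier is the depth of the element just opened.
        return len(self._tags) - 1

    def data(self, text):
        self._write(escape(text))

    def end(self, tag=None):
        current = self._tags[-1]
        if tag is not None and tag != current:
            raise ValueError(f"expected </{current}>, got </{tag}>")
        self._tags.pop()
        self._write(f"</{current}>")

    def close(self, id):
        # Close everything opened at or after the identified element.
        while len(self._tags) > id:
            self.end()


out = io.StringIO()
writer = TinyXMLWriter(out)
root = writer.start("TABLE", {"name": "t"})
writer.start("FIELD", ID="ra")
writer.data("a < b")
writer.close(root)
```

Here a single close(root) call closes both the FIELD and the TABLE element, which is the use of the opaque identifier described above.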
Raises a ValueError if uri is not a valid URI as defined in RFC 2396.
Raises a VOTableSpecError if ID is not a valid XML ID. name is the name of the attribute being checked (used only for error messages).
Raises a ValueError if content_type is not a valid MIME content type (syntactically at least), as defined by RFC 2045.
Raises a ValueError if token is not a valid XML token, as defined by XML Schema Part 2.
Given an arbitrary string, create one that can be used as an XML ID. This is rather simplistic at the moment, since it just replaces invalid characters with underscores.
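A comparably simple replacement scheme is sketched below. The function name make_xml_id, the ASCII-only character set and the extra underscore prefix for strings that do not start with a letter or underscore are assumptions for this example.

```python
import re


def make_xml_id(text):
    """Replace characters that are not valid in an ASCII XML ID."""
    cleaned = re.sub(r"[^A-Za-z0-9_.\-]", "_", text)
    # An XML ID may not start with a digit, '.' or '-'.
    if not re.match(r"[A-Za-z_]", cleaned):
        cleaned = "_" + cleaned
    return cleaned
```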
Returns an iterator over the elements of an XML file. See fast_iterparse for more information.
Converts the attributes of an object into a dictionary for use by the XMLWriter.
obj is any Python object
attrs is a sequence of attribute names to pull from the object
If any of the attributes is None, it will not appear in the output dictionary.
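The conversion described above can be sketched in a few lines. The function name attributes_to_dict is hypothetical, and the sketch treats a missing attribute the same as one set to None, which is an assumption.

```python
from types import SimpleNamespace


def attributes_to_dict(obj, attrs):
    """Collect the named attributes of obj, skipping any that are None."""
    result = {}
    for name in attrs:
        value = getattr(obj, name, None)
        if value is not None:
            result[name] = value
    return result


field = SimpleNamespace(ID="ra", name="RA", unit=None)
field_attributes = attributes_to_dict(field, ["ID", "name", "unit"])
```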
Based on cElementTree.iterparse, but doesn’t ever build a tree at all. This makes things much faster and more memory efficient.
The iterator returns 3-tuples (event, tag, data):
Validates the given file against the appropriate VOTable schema corresponding to the given version, which must be a string “1.0”, “1.1”, or “1.2”.
For version “1.0”, it is checked against a DTD, since that version did not have an XML Schema.