pyhanko.pdf_utils.metadata package
Submodules
pyhanko.pdf_utils.metadata.info module
- pyhanko.pdf_utils.metadata.info.update_info_dict(meta: DocumentMetadata, info: DictionaryObject, only_update_existing: bool = False) -> bool
- pyhanko.pdf_utils.metadata.info.view_from_info_dict(info_dict: DictionaryObject, strict: bool = True) -> DocumentMetadata
pyhanko.pdf_utils.metadata.model module
Added in version 0.14.0.
This module contains the XMP data model classes and namespace registry, in addition to a simplified document metadata model used for automated metadata management.
- pyhanko.pdf_utils.metadata.model.DC_CREATOR = http://purl.org/dc/elements/1.1/creator
creator in the dc namespace.
- pyhanko.pdf_utils.metadata.model.DC_DESCRIPTION = http://purl.org/dc/elements/1.1/description
description in the dc namespace.
- pyhanko.pdf_utils.metadata.model.DC_TITLE = http://purl.org/dc/elements/1.1/title
title in the dc namespace.
- pyhanko.pdf_utils.metadata.model.NS = {'dc': 'http://purl.org/dc/elements/1.1/', 'pdf': 'http://ns.adobe.com/pdf/1.3/', 'pdfaExtension': 'http://www.aiim.org/pdfa/ns/extension/', 'pdfaProperty': 'http://www.aiim.org/pdfa/ns/property#', 'pdfaSchema': 'http://www.aiim.org/pdfa/ns/schema#', 'pdfaid': 'http://www.aiim.org/pdfa/ns/id/', 'pdfuaid': 'http://www.aiim.org/pdfua/ns/id/', 'rdf': 'http://www.w3.org/1999/02/22-rdf-syntax-ns#', 'x': 'adobe:ns:meta/', 'xml': 'http://www.w3.org/XML/1998/namespace', 'xmp': 'http://ns.adobe.com/xap/1.0/'}
Known namespaces and their customary prefixes.
- pyhanko.pdf_utils.metadata.model.PDF_KEYWORDS = http://ns.adobe.com/pdf/1.3/keywords
keywords in the pdf namespace.
- pyhanko.pdf_utils.metadata.model.PDF_PRODUCER = http://ns.adobe.com/pdf/1.3/Producer
Producer in the pdf namespace.
- pyhanko.pdf_utils.metadata.model.RDF_ABOUT = http://www.w3.org/1999/02/22-rdf-syntax-ns#about
about in the rdf namespace.
- pyhanko.pdf_utils.metadata.model.RDF_ALT = http://www.w3.org/1999/02/22-rdf-syntax-ns#Alt
Alt in the rdf namespace.
- pyhanko.pdf_utils.metadata.model.RDF_BAG = http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag
Bag in the rdf namespace.
- pyhanko.pdf_utils.metadata.model.RDF_DESCRIPTION = http://www.w3.org/1999/02/22-rdf-syntax-ns#Description
Description in the rdf namespace.
- pyhanko.pdf_utils.metadata.model.RDF_LI = http://www.w3.org/1999/02/22-rdf-syntax-ns#li
li in the rdf namespace.
- pyhanko.pdf_utils.metadata.model.RDF_PARSE_TYPE = http://www.w3.org/1999/02/22-rdf-syntax-ns#parseType
parseType in the rdf namespace.
- pyhanko.pdf_utils.metadata.model.RDF_RDF = http://www.w3.org/1999/02/22-rdf-syntax-ns#RDF
RDF in the rdf namespace.
- pyhanko.pdf_utils.metadata.model.RDF_RESOURCE = http://www.w3.org/1999/02/22-rdf-syntax-ns#resource
resource in the rdf namespace.
- pyhanko.pdf_utils.metadata.model.RDF_SEQ = http://www.w3.org/1999/02/22-rdf-syntax-ns#Seq
Seq in the rdf namespace.
- pyhanko.pdf_utils.metadata.model.RDF_VALUE = http://www.w3.org/1999/02/22-rdf-syntax-ns#value
value in the rdf namespace.
- pyhanko.pdf_utils.metadata.model.VENDOR = 'pyHanko 0.0.0.dev1'
pyHanko version identifier in textual form
- pyhanko.pdf_utils.metadata.model.XML_LANG = http://www.w3.org/XML/1998/namespace/lang
lang in the xml namespace.
- pyhanko.pdf_utils.metadata.model.XMP_CREATEDATE = http://ns.adobe.com/xap/1.0/CreateDate
CreateDate in the xmp namespace.
- pyhanko.pdf_utils.metadata.model.XMP_CREATORTOOL = http://ns.adobe.com/xap/1.0/CreatorTool
CreatorTool in the xmp namespace.
- pyhanko.pdf_utils.metadata.model.XMP_MODDATE = http://ns.adobe.com/xap/1.0/ModifyDate
ModifyDate in the xmp namespace.
- pyhanko.pdf_utils.metadata.model.X_XMPMETA = adobe:ns:meta/xmpmeta
xmpmeta in the x namespace.
- pyhanko.pdf_utils.metadata.model.X_XMPTK = adobe:ns:meta/xmptk
xmptk in the x namespace.
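The constant values listed above all read as a namespace URI from NS followed by a local name. The following is a minimal standard-library sketch of that rendering, not pyHanko's own code: the function name render_expanded_name is hypothetical, and the separator rule is an assumption inferred from the listing (compare XML_LANG, where a slash appears, with DC_TITLE and RDF_SEQ, where none is added).

```python
# Illustrative copy of a few entries from the NS table above.
NS = {
    'dc': 'http://purl.org/dc/elements/1.1/',
    'rdf': 'http://www.w3.org/1999/02/22-rdf-syntax-ns#',
    'x': 'adobe:ns:meta/',
    'xml': 'http://www.w3.org/XML/1998/namespace',
}


def render_expanded_name(prefix: str, local_name: str) -> str:
    """Join a namespace URI and a local name the way the listing shows them."""
    ns = NS[prefix]
    # Assumption inferred from the listing: a '/' is inserted only when
    # the namespace URI does not already end in '/' or '#'.
    sep = '' if ns.endswith(('/', '#')) else '/'
    return f'{ns}{sep}{local_name}'


print(render_expanded_name('dc', 'title'))
print(render_expanded_name('rdf', 'Seq'))
print(render_expanded_name('xml', 'lang'))
```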
- class pyhanko.pdf_utils.metadata.model.DocumentMetadata(title: StringWithLanguage | str | None = None, author: StringWithLanguage | str | None = None, subject: StringWithLanguage | str | None = None, keywords: List[str] = <factory>, creator: StringWithLanguage | str | None = None, created: str | datetime | None = None, last_modified: str | datetime | None = 'now', xmp_extra: List[XmpStructure] = <factory>, xmp_unmanaged: bool = False)
Bases: object
Simple representation of document metadata. All entries are optional.
- title: StringWithLanguage | str | None = None
The document’s title.
- author: StringWithLanguage | str | None = None
The document’s author.
- subject: StringWithLanguage | str | None = None
The document’s subject.
- keywords: List[str]
Keywords associated with the document.
- creator: StringWithLanguage | str | None = None
The software that was used to author the document.
Note
This is distinct from the producer, which is typically used to indicate which PDF processor(s) interacted with the file.
- created: str | datetime | None = None
The time when the document was created. To set it to the current time, specify 'now'.
- last_modified: str | datetime | None = 'now'
The time when the document was last modified. Defaults to the current time upon serialisation if not specified.
- xmp_extra: List[XmpStructure]
Extra XMP metadata.
- xmp_unmanaged: bool = False
Flag metadata as XMP-only. This means that the info dictionary will be cleared out as much as possible, and that all attributes other than xmp_extra will be ignored when updating XMP metadata.
Note
The last-modified date and producer entries in the info dictionary will still be updated.
Note
DocumentMetadata represents a data model that is much simpler than what XMP is actually capable of. You can use this flag if you need more fine-grained control.
- view_over(base: DocumentMetadata)
- class pyhanko.pdf_utils.metadata.model.ExpandedName(ns: str, local_name: str)
Bases: object
An expanded XML name.
- ns: str
The URI of the namespace in which the name resides.
- local_name: str
The local part of the name.
- pyhanko.pdf_utils.metadata.model.MetaString
A regular string, a string with a language code, or nothing at all.
alias of StringWithLanguage | str | None
- class pyhanko.pdf_utils.metadata.model.Qualifiers(quals: Dict[ExpandedName, XmpValue])
Bases: object
XMP value qualifiers wrapper. Implements __getitem__. Note that xml:lang gets special treatment.
- Parameters:
quals – The qualifiers to model.
- classmethod of(*lst: Tuple[ExpandedName, XmpValue]) -> Qualifiers
Construct a Qualifiers object from a list of name-value pairs.
- Parameters:
lst – A list of name-value pairs.
- Returns:
A Qualifiers object.
- classmethod lang_as_qual(lang: str | None) -> Qualifiers
Construct a Qualifiers object that only wraps a language qualifier.
- Parameters:
lang – A language code.
- Returns:
A Qualifiers object.
- iter_quals(with_lang: bool = True) -> Iterable[Tuple[ExpandedName, XmpValue]]
Iterate over all qualifiers.
- Parameters:
with_lang – Include the language qualifier.
- Returns:
- property lang: str | None
Retrieve the language qualifier, if any.
- property has_non_lang_quals: bool
Check if there are any non-language qualifiers.
- class pyhanko.pdf_utils.metadata.model.XmpArray(array_type: XmpArrayType, entries: List[XmpValue])
Bases: object
An XMP array.
- array_type: XmpArrayType
The type of the array.
- classmethod ordered(lst: Iterable[XmpValue]) -> XmpArray
Convert a list to an ordered XMP array.
- Parameters:
lst – An iterable of XMP values.
- Returns:
An ordered XmpArray.
- class pyhanko.pdf_utils.metadata.model.XmpArrayType(*values)
Bases: Enum
XMP array types.
- ORDERED = 'Seq'
Ordered array.
- UNORDERED = 'Bag'
Unordered array.
- ALTERNATIVE = 'Alt'
Alternative array.
- as_rdf() -> ExpandedName
Render the type as an XML name.
- class pyhanko.pdf_utils.metadata.model.XmpStructure(fields: Dict[ExpandedName, XmpValue])
Bases: object
A generic XMP structure value. Implements __getitem__ for field access.
- Parameters:
fields – The structure’s fields.
- classmethod of(*lst: Tuple[ExpandedName, XmpValue]) -> XmpStructure
Construct an XmpStructure from a list of name-value pairs.
- Parameters:
lst – A list of name-value pairs.
- Returns:
An XmpStructure.
- class pyhanko.pdf_utils.metadata.model.XmpValue(value: XmpStructure | XmpArray | XmpUri | str, qualifiers: Qualifiers = <factory>)
Bases: object
A general XMP value, potentially with qualifiers.
- value: XmpStructure | XmpArray | XmpUri | str
The value.
- qualifiers: Qualifiers
Qualifiers that apply to the value.
pyhanko.pdf_utils.metadata.xmp_xml module
- pyhanko.pdf_utils.metadata.xmp_xml.iter_attrs(elem: _Element) -> Iterator[Tuple[ExpandedName, str]]
- pyhanko.pdf_utils.metadata.xmp_xml.serialise_xmp(roots: List[XmpStructure], out: BinaryIO)
- class pyhanko.pdf_utils.metadata.xmp_xml.MetadataStream(dict_data: dict | None = None, stream_data: bytes | None = None, encoded_data: bytes | None = None, handler: SecurityHandler | None = None)
Bases: StreamObject
- classmethod from_xmp(xmp: List[XmpStructure]) -> MetadataStream
- property xmp: List[XmpStructure]
- update_xmp_with_meta(meta: DocumentMetadata)
- pyhanko.pdf_utils.metadata.xmp_xml.update_xmp_with_meta(meta: DocumentMetadata, roots: Iterable[XmpStructure] = ())
- pyhanko.pdf_utils.metadata.xmp_xml.meta_from_xmp(roots: List[XmpStructure])
- exception pyhanko.pdf_utils.metadata.xmp_xml.XmpXmlProcessingError
Bases: ValueError
- pyhanko.pdf_utils.metadata.xmp_xml.parse_xmp(inp: BinaryIO) -> List[XmpStructure]
- pyhanko.pdf_utils.metadata.xmp_xml.register_namespaces()