Some of the registers Topograph integrates return company data in non-Latin scripts: Bulgarian and Ukrainian in Cyrillic, Greek in Greek script, Chinese (mainland and Hong Kong) in Chinese characters. When the client application only handles Latin characters, you can ask the API to romanize the response payload on the way out.

Transliteration is opt-in. Pass transliterate=true as a query parameter on /v2/search or /v2/company, and the response comes back with user-facing string fields rewritten in Latin characters.
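As a rough illustration of passing the flag from client code, the Python sketch below builds a search URL with the standard library. The helper name build_search_url and its signature are hypothetical; the host, the path, and the country, query, stream, and transliterate parameters are taken from this page. Because the flag is ignored in streaming mode (see Known limitations), the sketch chooses to reject that combination rather than send it.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.topograph.co"  # host used by the curl examples on this page


def build_search_url(country, query, transliterate=False, stream=False):
    """Build a /v2/search URL (hypothetical helper, not part of any SDK)."""
    if transliterate and stream:
        # The API ignores the flag in streaming mode, so fail early on the client.
        raise ValueError("transliterate is ignored when stream=true; use JSON mode")
    params = {"country": country, "query": query}
    if stream:
        params["stream"] = "true"
    if transliterate:
        params["transliterate"] = "true"
    # urlencode percent-encodes non-ASCII query text as UTF-8.
    return f"{BASE_URL}/v2/search?{urlencode(params)}"
```

The x-api-key header still has to be attached by whatever HTTP client sends the request.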

What gets transformed

The transform touches fields that carry user-facing text in the source language:
  • Company names: legalName, commercialNames, legacyLegalNames, legacyCommercialNames
  • Address components: addressLine1, addressLine2, city, state, region, careOf, poBox
  • Person names: title, firstName, middleName, lastName, fullName, suffix
  • Establishment, document, and graph node names: name
  • Activity, control, and relationship descriptions: description, activityDescription
  • Legal form, role, and status local labels: localName
  • Status explanations: additionalInfo
Fields that are already in Latin form, or that represent identifiers rather than text, are never transformed. That includes legalNameInEnglish, englishTranslation, standardized, iso20275Code, iso5009Code, countryCode, postalCode, id, dates, URLs, phone numbers, and any numeric or enumerated value. Strings that are already in Latin script pass through untouched regardless of which field they live on.
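The field rules above can be pictured as a recursive walk over the response. The Python sketch below is an illustration only, not Topograph's implementation: the field list is copied from this page, while transliterate_payload, has_non_latin, the romanize callback, and the payload shape are assumptions made for the example.

```python
import unicodedata

# Field names listed on this page as carrying user-facing text.
TEXT_FIELDS = {
    "legalName", "commercialNames", "legacyLegalNames", "legacyCommercialNames",
    "addressLine1", "addressLine2", "city", "state", "region", "careOf", "poBox",
    "title", "firstName", "middleName", "lastName", "fullName", "suffix",
    "name", "description", "activityDescription", "localName", "additionalInfo",
}


def has_non_latin(text):
    """Return True if any letter in text belongs to a non-Latin script."""
    return any(
        ch.isalpha() and not unicodedata.name(ch, "").startswith("LATIN")
        for ch in text
    )


def transliterate_payload(node, romanize, in_text_field=False):
    """Return a copy of node with romanize applied to non-Latin text fields only."""
    if isinstance(node, str):
        if in_text_field and has_non_latin(node):
            return romanize(node)
        return node  # identifiers and strings already in Latin pass through
    if isinstance(node, list):
        # A list inherits the status of the field that holds it.
        return [transliterate_payload(item, romanize, in_text_field) for item in node]
    if isinstance(node, dict):
        return {
            key: transliterate_payload(value, romanize, key in TEXT_FIELDS)
            for key, value in node.items()
        }
    return node  # numbers, booleans, None
```

Passing a stand-in such as lambda s: "<" + s + ">" as romanize makes it easy to see which values would be rewritten and which are left alone.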

What does not change

The flag is a presentation toggle, not a data mutation:
  • Cache content on the Topograph side stays in the native script. Repeated calls with and without the flag hit the same cache.
  • Billing is unchanged. Transliteration is free.
  • Non-Latin fields are replaced with their romanized equivalents in the response. If you need the native-script value as well, issue a second call without the flag.

How routing works

The API picks a romanization strategy per string based on two signals: the Unicode script of the string and the country code of the record.
| Script detected | Library used | Standard |
| --- | --- | --- |
| Cyrillic | transliteration (unidecode) | BGN/PCGN-compatible |
| Greek | transliteration (unidecode) | ISO 843 |
| Hangul | transliteration (unidecode) | Revised Romanization |
| Hebrew, Arabic, Thai | transliteration (unidecode) | ALA-LC compatible |
| CJK with hiragana or katakana | kuroshiro + kuromoji | Hepburn |
| CJK with kanji only, country = JP | kuroshiro + kuromoji | Hepburn |
| CJK with kanji only, other country | pinyin-pro | Hanyu Pinyin, toneless |
| Latin or ASCII | no-op | n/a |
Mixed-script strings are handled automatically. A string like "Nokia 中国" on a Chinese record becomes "Nokia Zhong Guo": the Latin run passes through untouched and the Chinese run is routed to pinyin.
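The routing table can be approximated in a few lines of Python. This is a minimal sketch under stated assumptions, not the API's actual code: script detection here relies on Unicode character-name prefixes from the standard library, the function names are hypothetical, and the return values are just labels for the libraries named in the table.

```python
import unicodedata

# Unicode character-name prefixes mapped to coarse script labels (assumed heuristic).
_SCRIPT_PREFIXES = (
    ("LATIN", "latin"),
    ("CYRILLIC", "cyrillic"),
    ("GREEK", "greek"),
    ("HANGUL", "hangul"),
    ("HEBREW", "hebrew"),
    ("ARABIC", "arabic"),
    ("THAI", "thai"),
    ("HIRAGANA", "kana"),
    ("KATAKANA", "kana"),
    ("CJK", "han"),
)
_UNIDECODE_SCRIPTS = {"cyrillic", "greek", "hangul", "hebrew", "arabic", "thai"}


def script_of(ch):
    """Return a script label for a letter, or None for digits, spaces, punctuation."""
    if not ch.isalpha():
        return None
    name = unicodedata.name(ch, "")
    for prefix, script in _SCRIPT_PREFIXES:
        if name.startswith(prefix):
            return script
    return "other"


def pick_strategy(text, country_code):
    """Pick a romanization strategy label from the scripts present and the country."""
    scripts = {script_of(ch) for ch in text} - {None}
    if "kana" in scripts:
        return "kuroshiro"  # hiragana or katakana present: Japanese reading
    if "han" in scripts:
        # Han characters alone are ambiguous, so the country code decides.
        return "kuroshiro" if country_code == "JP" else "pinyin-pro"
    if scripts & _UNIDECODE_SCRIPTS:
        return "transliteration"
    return "no-op"  # Latin or ASCII only
```

For example, pick_strategy("Nokia 中国", "CN") returns "pinyin-pro", while the same Han characters on a record with country code JP would be routed to "kuroshiro".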

Known limitations

Transliteration is a mechanical transform. It is not the same as translation, and it is not the same as the company’s own branding in Latin characters.
  • Japanese proper nouns are imperfect. Kuroshiro reads kanji through a morphological dictionary. For unusual personal and company names, the reading the dictionary picks may differ from the reading the company actually uses. If accuracy on a specific Japanese entity matters, cross-check against the source.
  • Chinese is character-level, not word-level. Output is spaced per syllable (Bei Jing Xiao Mi), not joined (Beijing Xiaomi). Word-level joining would require a separate segmentation pass and we do not do it today.
  • Source-provided Latin names are ignored. Some Hong Kong and Chinese records carry an official English name from the source register. The flag always produces a mechanical romanization of the native string, not that official English name. If you want the official English name, read legalNameInEnglish: that field is never transliterated, so it is the same with or without the flag.
  • Streaming search is not transliterated. /v2/search?stream=true returns Server-Sent Events straight from the search pipeline. The flag is ignored in streaming mode. Use the standard JSON mode when you need transliteration.
  • Output can differ across versions. The output of pinyin-pro and kuroshiro can change between library or dictionary versions. Library versions are pinned in the API image, so the output is stable within a release.
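Because Chinese output is spaced per syllable, a client that compares romanized names against joined forms may want to normalize both sides first. The helper below is a hypothetical client-side sketch in Python, not something the API does: it drops whitespace and case so that "Bei Jing Xiao Mi" and "Beijing Xiaomi" produce the same comparison key.

```python
def match_key(romanized):
    """Collapse whitespace and case so spaced and joined forms compare equal."""
    return "".join(romanized.split()).casefold()


def same_name(left, right):
    """Compare two romanized names while ignoring spacing and case."""
    return match_key(left) == match_key(right)
```

The key discards word boundaries, so it suits matching and deduplication rather than display.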

Examples

Search a Bulgarian company and get Latin results:
curl -H "x-api-key: $TOPOGRAPH_API_KEY" \
  "https://api.topograph.co/v2/search?country=BG&query=%D0%92%D0%B0%D0%B7%D1%80%D0%B0%D0%B6%D0%B4%D0%B0%D0%BD%D0%B5&transliterate=true"
Retrieve a Chinese company profile in Latin:
curl -H "x-api-key: $TOPOGRAPH_API_KEY" \
  -X POST "https://api.topograph.co/v2/company?transliterate=true" \
  -H "Content-Type: application/json" \
  -d '{"countryCode":"CN","id":"91310000MA1FL10P8Q","dataPoints":["company"]}'
Fetch a previously created request in Latin:
curl -H "x-api-key: $TOPOGRAPH_API_KEY" \
  "https://api.topograph.co/v2/company/253299d1-e8d0-4268-945b-f175f98bc114?transliterate=true"

When to use this

Reach for transliterate=true when:
  • Your downstream system (CRM, KYC engine, search index, spreadsheet export) only indexes Latin text.
  • You want to display company and officer names in a Western UI without glyph fallbacks.
  • You are matching names across jurisdictions and need a consistent Latin form on both sides.
Do not reach for it when:
  • You need the company’s own brand in Latin. Use legalNameInEnglish instead when the register provides it.
  • You need to preserve the native-script value alongside the Latin form. Issue a second call without the flag, or cache the unflagged response and call again with the flag only when rendering.
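If you do issue both calls, the two payloads can be combined on the client. The Python sketch below assumes the flagged and unflagged responses have the same shape, which this page implies but does not guarantee; pair_scripts and the "native" and "latin" keys are hypothetical names chosen for the example.

```python
def pair_scripts(native, latin):
    """Merge two same-shaped payloads, keeping both forms where the strings differ."""
    if isinstance(native, dict) and isinstance(latin, dict):
        # Fall back to the native value when a key is missing on the Latin side.
        return {key: pair_scripts(value, latin.get(key, value))
                for key, value in native.items()}
    if isinstance(native, list) and isinstance(latin, list) and len(native) == len(latin):
        return [pair_scripts(n, l) for n, l in zip(native, latin)]
    if isinstance(native, str) and isinstance(latin, str) and native != latin:
        return {"native": native, "latin": latin}
    return native  # identical values, identifiers, numbers, or mismatched shapes
```

Fields the transform never touches, such as countryCode or legalNameInEnglish, come through as plain values because they are identical in both responses.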