MIRACLE

Existing data for South Korea's developmental period is either too coarse (province-level) or too infrequent (census every 5–10 years) for studying how rapid growth transformed local economies. MIRACLE digitises municipal statistical yearbooks into the first annual, township-level economic panel covering 1960–1989 — with near-universal geographic coverage from ~2 million pages of county archives.

~3,500 townships
30 annual panels
100+ variables
~2M pages of archives
About MIRACLE

MIRACLE is a township-year panel linking demographic, agricultural, industrial, fiscal, and infrastructure data for South Korea's ~3,500 townships across the 1960–1989 period. Each observation is tied to a time-consistent identifier (miracle_id) that tracks townships through two major boundary reorganisations (1963, 1973). Nine thematic modules — from population and paddy area to school counts and road kilometres — are harmonised from municipal statistical yearbooks (시군 통계연보) published annually by county governments.

Farming area by township, Namhae-gun (1969). Mixed Hangul/Hanja headings — a typical source page.

MIRACLE starts with South Korea's municipal statistical yearbooks, but the ambition extends in two directions. First, within Korea, we plan to incorporate additional administrative sources — expressway construction logs, agricultural extension records, Korea Forest Service archives, colonial-era household registries, and local personnel files — to deepen the panel and enable research designs that link infrastructure, agricultural modernisation, and environmental policy to local institutional conditions.

Second, across countries, the infrastructure we build is designed to accommodate other growth miracle economies with comparable subnational statistical traditions. If similar municipal records exist for Taiwan, or district-level yearbooks for post-war Japan, they belong in the same framework. The goal is a comparative subnational data platform for studying rapid development wherever it has occurred.

Source material

The source material is the municipal statistical yearbooks (시군 통계연보) published annually by county and city governments. Volumes are dispersed across provincial archives, university libraries, and government collections, and have never been systematically compiled or digitised.

Population by township, Kosung-gun (1975). Households and population counts.

Pipeline

From scanned page to analysis-ready panel in four steps:

01. AI-OCR for mixed scripts (Current focus)

Custom pipeline fine-tuned for mixed Hangul/Hanja archival tables. 87% pilot accuracy, targeting 92–95%. This is what makes the project feasible — these documents were previously unusable at scale.

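Structured output: 경지면적현황 (farmland area), 남해군 (1969)
읍면 (township) | 합계 (total) | 논 (답) 소계 (paddy subtotal) | 논 1모작 (single-crop) | 논 2모작 (double-crop) | 밭 (전) (dry field) | Check
남해 | 8,054 | 5,849 | 1,230 | 4,619 | 2,205 | ✓ balanced
이동 | 10,993 | 7,544 | 1,175 | 6,369 | 3,449 | ✓ balanced
삼동 | 12,785 | 7,349 | 1,239 | 6,110 | 5,436 | ✓ balanced
남면 | 11,470 | 6,012 | 857 | 5,155 | 5,458 | ✓ balanced
고현 | 8,310 | 5,680 | 743 | 4,937 | 2,630 | ✓ balanced
창선 | 13,173 | 7,901 | 2,311 | 5,590 | 5,272 | ✓ balanced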
All township names correct. Nested headers preserved. Row-level cross-validation passed.
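
The row-level cross-validation can be illustrated with a short check. This is a minimal sketch, not the project's validator: it assumes (from the header nesting and the arithmetic of the rows) that the columns are total, paddy subtotal, single-crop paddy, double-crop paddy, and dry field, and the names `ROWS` and `check_row` are made up for the example.

```python
# Rows from the 남해군 (1969) farmland table above. Assumed column order:
# (township, total, paddy subtotal, single-crop paddy, double-crop paddy, dry field)
ROWS = [
    ("남해", 8054, 5849, 1230, 4619, 2205),
    ("이동", 10993, 7544, 1175, 6369, 3449),
    ("삼동", 12785, 7349, 1239, 6110, 5436),
    ("남면", 11470, 6012, 857, 5155, 5458),
    ("고현", 8310, 5680, 743, 4937, 2630),
    ("창선", 13173, 7901, 2311, 5590, 5272),
]


def check_row(total, paddy, single, double, dry):
    """Return the accounting identities a row violates (empty list = balanced)."""
    problems = []
    if single + double != paddy:
        problems.append("paddy subtotal != single-crop + double-crop")
    if paddy + dry != total:
        problems.append("total != paddy subtotal + dry field")
    return problems


report = {name: check_row(*values) for name, *values in ROWS}
for name, problems in report.items():
    print(name, "balanced" if not problems else problems)
```

A row that fails either identity is a candidate for re-reading the scan, since a single misrecognised digit usually breaks at least one of the two sums.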
Source: 경지면적현황, 남해군 통계연보 (1969) — mixed Hangul/Hanja table with vertical headers
1 | 農 業 ~22 | ← Page number confusion
2 | 2 경 지 면 적 현 황 | ← Vertical text → individual chars
3 | (단위 :단보)
4 | 구분 합 게 등 게 답 1포작 2포작 전 미 합게 | ← Nested headers flattened
5 | 면별 8,054 5,849 1,230 4,619 2,205 444 | ← Row-column mapping unclear
6 | 남 해 10,993 7,544 1,175 6,369 3,449 890 | ← Numbers may be misaligned
7 | 설 동 11,470 6,012 857 5,155 5,458 841 | ← '삼동' → '설동' misrecognised
8 | 남 9,553 5,891 1,101 4,790 3,662 522 | ← Township name truncated
9 | 저 현 9,564 9,705 550 6,145 2,859 955 | ← '고현' split across lines
10 | 창 13,173 7,901 2,311 5,590 5,272 | ← '창선' → '창' only
⚠ Vertical headers completely failed. Table structure unrecoverable.
Layout parsing failure: nested headers flattened; column-data mapping lost
Vertical text failure: vertical Korean split into individual characters
Cell mapping errors: numbers detached from columns
Same source — context-aware layout parsing + structured output
Step 1 (Layout): Table regions, header hierarchy, vertical text
Step 2 (Context OCR): '경지면적' context corrects '설동'→'삼동'
Step 3 (Structure): Nested headers → hierarchical CSV
Step 4 (Validate): Row totals = column totals; cross-ref

See structured output table above.
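
The '설동'→'삼동' correction can be approximated with a gazetteer lookup. The sketch below is an assumption about how such a step might work, not the project's pipeline: it decomposes Hangul syllables into jamo so that near-misses score higher than unrelated names, then picks the closest known township. `GAZETTEER`, `to_jamo`, `correct_name`, and the 0.6 threshold are all hypothetical.

```python
import unicodedata
from difflib import SequenceMatcher

# Township names for 남해군 as they appear in the structured output above.
GAZETTEER = ["남해", "이동", "삼동", "남면", "고현", "창선"]


def to_jamo(text):
    """Drop spaces and decompose Hangul syllables into conjoining jamo (NFD)."""
    return unicodedata.normalize("NFD", text.replace(" ", ""))


def correct_name(ocr_text, candidates=GAZETTEER, threshold=0.6):
    """Return the most similar gazetteer name, or None if none is close enough."""
    target = to_jamo(ocr_text)
    scored = [
        (SequenceMatcher(None, target, to_jamo(name)).ratio(), name)
        for name in candidates
    ]
    score, best = max(scored)
    return best if score >= threshold else None


print(correct_name("설 동"))
```

Comparing at the jamo level matters here: at the syllable level '설동' is equally close to '삼동' and '이동', whereas the shared initial consonant ㅅ breaks the tie in favour of '삼동'.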

02. Variable harmonisation (Current focus)

Definitions, units, and table structures changed across editions and municipalities. We build crosswalks reconciling these into consistent time series.
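
As an illustration of what a crosswalk entry might look like, the sketch below maps printed column headers to a harmonised variable and converts areas to hectares. It assumes the 1969 Namhae-gun table is reported in 단보 (as its unit line indicates) and uses the standard equivalences 1 단보 = 300 평 and 1 평 = 400/121 m². The crosswalk contents, the variable name `dryfield_ha`, and the function `harmonise` are hypothetical.

```python
# 1 단보 = 300 평 and 1 평 = 400/121 m², so 1 단보 ≈ 991.74 m² ≈ 0.0992 ha.
M2_PER_PYEONG = 400 / 121
HA_PER_DANBO = 300 * M2_PER_PYEONG / 10_000

# Hypothetical crosswalk: header as printed -> (harmonised variable, source unit).
HEADER_CROSSWALK = {
    "논 (답)": ("paddy_ha", "danbo"),
    "답": ("paddy_ha", "danbo"),
    "밭 (전)": ("dryfield_ha", "danbo"),
}

# Conversion factors from each source unit to hectares.
TO_HECTARES = {"danbo": HA_PER_DANBO, "ha": 1.0}


def harmonise(header, value):
    """Map a printed header and raw value to (variable name, value in hectares)."""
    variable, unit = HEADER_CROSSWALK[header]
    return variable, round(value * TO_HECTARES[unit], 1)


print(harmonise("논 (답)", 5849))
```

Keeping the unit next to each header variant means an edition that switches to hectares only needs a new crosswalk row, not new conversion code.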

03. Boundary concordances (Pilot complete)

Two major reorganisations (1963, 1973) plus dozens of smaller changes. We construct time-consistent miracle_id identifiers.
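
One common way to build time-consistent units is to map every historical code to the identifier of the stable unit that contains it, then sum count variables within that unit. The sketch below illustrates that idea only. The source does not describe how miracle_id treats mergers and splits, and every code, year, identifier, and value in `CROSSWALK` is invented for the example.

```python
from collections import defaultdict

# Hypothetical crosswalk rows: (source code, first year, last year, miracle_id).
# All codes, years, and identifiers are invented for illustration.
CROSSWALK = [
    ("A-01", 1960, 1972, "KR-00-000-010"),
    ("A-02", 1960, 1972, "KR-00-000-010"),  # merged with A-01 in 1973
    ("A-10", 1973, 1989, "KR-00-000-010"),  # the merged township
]


def lookup(source_code, year):
    """Return the time-consistent id for a historical code in a given year."""
    for code, start, end, miracle_id in CROSSWALK:
        if code == source_code and start <= year <= end:
            return miracle_id
    raise KeyError((source_code, year))


def aggregate(records):
    """Sum a count variable over source units sharing a miracle_id and year."""
    totals = defaultdict(int)
    for source_code, year, value in records:
        totals[(lookup(source_code, year), year)] += value
    return dict(totals)


panel = aggregate([("A-01", 1970, 12000), ("A-02", 1970, 8000), ("A-10", 1975, 19500)])
print(panel)
```

Summation is only appropriate for counts such as population or area; ratios and averages would need weights, which this sketch does not cover.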

04. Geocoding & GIS (Pilot complete)

Every township linked to satellite, elevation, slope, soil, and transport network data. 196 Namhae-gun villages fully geocoded.


Output

The dataset is organised into modules by domain, each a flat township-year panel. Merge across modules using Core Keys. CSV & Stata formats, with full codebook and variable documentation.

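miracle_id | year | prov | muni | twp | pop | hh | paddy_ha | schools | road_km
KR-48-840-010 | 1970 | 경남 | 남해군 | 남해읍 | 28,412 | 5,680 | 1,245 | 7 | 23.4
KR-48-840-010 | 1975 | 경남 | 남해군 | 남해읍 | 25,891 | 5,320 | 1,198 | 8 | 31.7
KR-48-840-010 | 1980 | 경남 | 남해군 | 남해읍 | 22,105 | 5,010 | 1,152 | 8 | 38.2
KR-47-720-030 | 1970 | 경북 | 영주시 | 풍기읍 | 31,550 | 6,140 | 1,870 | 9 | 18.6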
Illustrative example — pilot data release late 2026.
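
Since each module is a flat township-year panel, combining two modules amounts to a join on the identifier and year. The sketch below is a minimal standard-library illustration using values from the example rows above. The CSV layout, the choice of (miracle_id, year) as the join key, and the helper names `read_module` and `merge_modules` are assumptions, not the released file format.

```python
import csv
import io

# Illustrative module extracts, using values from the example table above.
# The real files' layout may differ; this structure is an assumption.
DEMOGRAPHICS = """miracle_id,year,pop,hh
KR-48-840-010,1970,28412,5680
KR-48-840-010,1975,25891,5320
"""
AGRICULTURE = """miracle_id,year,paddy_ha
KR-48-840-010,1970,1245
KR-48-840-010,1975,1198
"""


def read_module(text):
    """Index a module's rows by the (miracle_id, year) key."""
    rows = csv.DictReader(io.StringIO(text))
    return {(r["miracle_id"], int(r["year"])): r for r in rows}


def merge_modules(left, right):
    """Inner join of two modules on (miracle_id, year)."""
    return {key: {**left[key], **right[key]} for key in left.keys() & right.keys()}


merged = merge_modules(read_module(DEMOGRAPHICS), read_module(AGRICULTURE))
row = merged[("KR-48-840-010", 1970)]
print(row["pop"], row["paddy_ha"])
```

An inner join keeps only township-years present in both modules, which makes gaps in either source visible as missing rows rather than silent blanks.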
Module | Description | ETA
Core Keys (miracle_id · province · municipality · township · concordances) | Geographic identifiers and boundary concordances across the 1963/1973 reorganisations. | 2026
Demographics (population · households · age structure) | Population counts, household numbers, demographic composition. | 2026
Agriculture (paddy area · crop output · livestock) | Cultivated area, output (harmonised to metric units), livestock. | 2026
Industry (establishments · employment · output) | Industrial establishments, manufacturing employment, sectoral output. | 2027
Infrastructure (roads · electricity · water · telecom) | Road length, electrification, public utilities. | 2027
Public Finance (revenues · expenditures · transfers) | Municipal revenue/expenditure, central transfers, fiscal capacity. | 2027
Education (schools · enrolment · teachers) | School counts, enrolment, teachers, educational infrastructure. | 2027
Geospatial (shapefiles · centroids · boundaries) | GIS boundary files with consistent township geometries. | 2027
Institutions (clan concentration · bureaucratic capacity) | Pre-treatment institutional measures from 1930 registries and personnel files. | 2028
Pilot release: late 2026. Gyeongbu Expressway corridor (~400 townships). Core Keys, Demographics, and Agriculture modules. CSV & Stata formats. Request early access.
Public data explorer: an interactive dashboard for browsing county-level data, in development.
Research using MIRACLE

MIRACLE enables research designs that were previously impossible, including the first causal analyses of the Gyeongbu Expressway, the Saemaul Undong, Korea's high-yield rice revolution, and one of history's largest reforestation programmes.

Seol (2026, R&R at Journal of Political Economy) — the Saemaul Undong paper — is built entirely on MIRACLE data.

Other applications include the geography of industrialisation, education expansion, fiscal transfers, land reform, environmental policy, and developmental states.

Seol, BooKang (2026). "The Saemaul Undong and Rural Development in South Korea." R&R at Journal of Political Economy.


Using MIRACLE data in your research? Let us know — we'd like to hear about it.

Geographic coverage

Digitisation proceeds province by province, constrained by the uneven survival of physical yearbooks across Korea's provincial archives. Hover over each province for details on coverage, year range, and scanning status.

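[Map: digitisation status by province, based on administrative boundaries. Provinces shown: Gyeonggi (경기), Gangwon (강원), Chungbuk (충북), Chungnam (충남), Jeonbuk (전북), Jeonnam (전남), Gyeongbuk (경북), Gyeongnam (경남), Jeju (제주), Seoul (서울), Busan (부산). Namhae-gun (남해군) pilot: 196 villages geocoded. Legend: Pilot complete, Digitising, Sources located, Planned.]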

Last updated March 2026

Coverage map: township-level digitisation progress across 191 municipalities, 1956–1985.

Timeline
2023–24 (Done): Source identification. AI-OCR pipeline development. 196 villages geocoded in Namhae-gun. Partnerships with KDI and Sogang.
2025 (Done): Systematic digitisation. OCR fine-tuning. Variable harmonisation. GIS boundary reconciliation.
2026 (Active): Pilot release: Gyeongbu Expressway corridor townships. Core Keys and initial domain modules.
2027–28 (Planned): Full national coverage. Additional archival sources. Expansion to other growth miracle economies.
Team
Principal Investigator: BooKang Seol (설북강), Postdoctoral Researcher, LSE. bookangseol.com
Co-Investigator: Changkeun Lee (이창근), Korea Development Institute (KDI)
Co-Investigator: Hyunjoo Yang (양현주), Dept. of Economics, Sogang University

Hiring research assistants for 2026–27. Get in touch.

Research Assistant (to be recruited): OCR pipeline & quality validation
Research Assistant (to be recruited): GIS & geocoding
Research Assistant (to be recruited): Variable harmonisation

Partners

KDI: Korea Development Institute
LSE: London School of Economics
Sogang: Sogang University
STEG: Structural Transformation & Economic Growth

For early access, collaboration, or questions, get in touch with the team.

Seol, BooKang, Changkeun Lee, and Hyunjoo Yang. "MIRACLE: Subnational Economic Data for South Korea's Developmental Period, 1960–1989." London School of Economics, 2026.

@techreport{seol2026miracle,
  author = {Seol, BooKang and Lee, Changkeun and Yang, Hyunjoo},
  title = {{MIRACLE}: Subnational Economic Data for South Korea's Developmental Period, 1960--1989},
  institution = {London School of Economics},
  year = {2026}
}