Overview

Dataset statistics

Number of variables: 3
Number of observations: 2742
Missing cells: 1203
Missing cells (%): 14.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 67.1 KiB
Average record size in memory: 25.0 B
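
The missing-cell percentage can be checked from the counts in this section. The sketch below is only that arithmetic; the variable names are illustrative and are not part of ydata-profiling.

```python
# Figures copied from the dataset statistics in this report.
n_variables = 3
n_observations = 2742
missing_cells = 1203

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# All missing cells sit in one column (사업체소재지), so the per-column
# share divides by the number of observations instead.
column_missing_pct = 100 * missing_cells / n_observations

print(f"total cells: {total_cells}")                   # total cells: 8226
print(f"missing cells: {missing_pct:.1f}%")            # missing cells: 14.6%
print(f"missing in column: {column_missing_pct:.1f}%")  # missing in column: 43.9%
```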

Variable types

Numeric: 1
Text: 2

Dataset

Description: Provides data on publishers in Yongsan-gu, Seoul (serial number, publisher business name, and publisher business location).
URL: https://www.data.go.kr/data/15090482/fileData.do

Alerts

사업체소재지 has 1203 (43.9%) missing values (Missing)
연번 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 14:48:59.107066
Analysis finished: 2023-12-12 14:48:59.856281
Duration: 0.75 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번
Real number (ℝ)

UNIQUE 

Distinct: 2742
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1371.5
Minimum: 1
Maximum: 2742
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 24.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 138.05
Q1: 686.25
Median: 1371.5
Q3: 2056.75
95-th percentile: 2604.95
Maximum: 2742
Range: 2741
Interquartile range (IQR): 1370.5
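
The quantiles listed here are consistent with linear interpolation between neighbouring order statistics over the values 1 to 2742. Whether ydata-profiling computes them exactly this way is an assumption; `quantile` below is a hypothetical helper written for this sketch, not a library function.

```python
import math


def quantile(sorted_values, q):
    """Quantile by linear interpolation between neighbouring order statistics."""
    pos = q * (len(sorted_values) - 1)        # fractional 0-based position
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_values) - 1)  # stay in range when q == 1
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])


# 연번 takes every integer from 1 to 2742 exactly once.
values = list(range(1, 2743))

q1 = quantile(values, 0.25)
median = quantile(values, 0.50)
q3 = quantile(values, 0.75)
p05 = quantile(values, 0.05)
p95 = quantile(values, 0.95)
iqr = q3 - q1

print(q1, q3, iqr)  # 686.25 2056.75 1370.5
```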

Descriptive statistics

Standard deviation: 791.69154
Coefficient of variation (CV): 0.57724502
Kurtosis: -1.2
Mean: 1371.5
Median Absolute Deviation (MAD): 685.5
Skewness: 0
Sum: 3760653
Variance: 626775.5
Monotonicity: Strictly increasing
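
Because 연번 is simply the sequence 1, 2, ..., 2742, several of these figures can be recomputed by hand. The sketch below uses only the standard library and assumes the sample variance (division by n - 1), which matches the reported value; the variable names are illustrative.

```python
import math

values = list(range(1, 2743))  # 연번: 1, 2, ..., 2742
n = len(values)

total = sum(values)
mean = total / n

# Sample variance: squared deviations divided by n - 1.
variance = sum((x - mean) ** 2 for x in values) / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation

# Median absolute deviation: median of |x - median|.
# For this symmetric sequence the median equals the mean.
abs_dev = sorted(abs(x - mean) for x in values)
mad = (abs_dev[n // 2 - 1] + abs_dev[n // 2]) / 2  # n is even

# Strictly increasing: every value is smaller than its successor.
strictly_increasing = all(a < b for a, b in zip(values, values[1:]))

print(total, mean, variance)     # 3760653 1371.5 626775.5
print(mad, strictly_increasing)  # 685.5 True
```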
Histogram with fixed size bins (bins=50)
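
As a rough illustration of what fixed-size binning with 50 bins does to this variable, the sketch below splits the range 1 to 2742 into 50 equal-width bins and counts the values in each. The edge convention (half-open bins, with the maximum placed in the last bin) is an assumption borrowed from common histogram implementations, not something the report states.

```python
values = range(1, 2743)  # 연번: 1 to 2742
bins = 50

lo, hi = min(values), max(values)
width = (hi - lo) / bins  # equal-width bins across the full range

counts = [0] * bins
for x in values:
    # The maximum falls exactly on the last edge, so clamp it into the final bin.
    idx = min(int((x - lo) / width), bins - 1)
    counts[idx] += 1

# A gap-free integer sequence fills every bin almost evenly.
print(len(counts), sum(counts), sorted(set(counts)))
```

Since the values have no gaps, the resulting histogram is essentially flat, which agrees with the skewness of 0 reported for this variable.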
Common values
Value | Count | Frequency (%)
1 | 1 | < 0.1%
1833 | 1 | < 0.1%
1825 | 1 | < 0.1%
1826 | 1 | < 0.1%
1827 | 1 | < 0.1%
1828 | 1 | < 0.1%
1829 | 1 | < 0.1%
1830 | 1 | < 0.1%
1831 | 1 | < 0.1%
1832 | 1 | < 0.1%
Other values (2732) | 2732 | 99.6%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%
10 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
2742 | 1 | < 0.1%
2741 | 1 | < 0.1%
2740 | 1 | < 0.1%
2739 | 1 | < 0.1%
2738 | 1 | < 0.1%
2737 | 1 | < 0.1%
2736 | 1 | < 0.1%
2735 | 1 | < 0.1%
2734 | 1 | < 0.1%
2733 | 1 | < 0.1%

사업체명칭
Text

Distinct: 2675
Distinct (%): 97.6%
Missing: 0
Missing (%): 0.0%
Memory size: 21.6 KiB

Length

Max length: 67
Median length: 39
Mean length: 7.183078
Min length: 1

Characters and Unicode

Total characters: 19696
Distinct characters: 782
Distinct categories: 11
Distinct scripts: 4
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
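
To make the category counts concrete, the sketch below classifies the characters of one sample value with Python's standard `unicodedata` module. It covers only the General Category property; the standard library does not expose script or block names, so those breakdowns are not reproduced here.

```python
import unicodedata
from collections import Counter

# Third sample row of 사업체명칭, normalised to NFC so that each Hangul
# syllable is a single code point.
name = unicodedata.normalize("NFC", "도서출판(주)문진미디")

# unicodedata.category returns the two-letter General Category code, e.g.
# "Lo" (Other Letter), "Ps" (Open Punctuation), "Pe" (Close Punctuation).
categories = Counter(unicodedata.category(ch) for ch in name)

for code, count in categories.most_common():
    print(code, count)
```

Hangul syllables fall under "Lo" (Other Letter), which is why that category dominates the counts for both text variables in this report.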

Unique

Unique: 2613
Unique (%): 95.3%
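
The report lists both a Distinct and a Unique count. A reading consistent with the figures is that Distinct counts the different values present, while Unique counts the values that occur exactly once. The sketch below applies that reading to the first ten 사업체명칭 values shown in the report's sample rows; the variable names are illustrative.

```python
from collections import Counter

# First ten 사업체명칭 values from the sample rows of this report.
names = [
    "극동출판사",
    "도서출판 한진",
    "도서출판(주)문진미디",
    "아동문학사",
    "기독교복음침례회",
    "(주)삼중당",
    "도서출판 탐구당",
    "장문사",
    "문호사",
    "문호사",
]

counts = Counter(names)
distinct = len(counts)                                # different values
unique = sum(1 for c in counts.values() if c == 1)    # values seen exactly once

print(distinct, unique)  # 9 8
```

Here 문호사 appears twice, so it contributes to the distinct count but not to the unique count.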

Sample

1st row: 극동출판사
2nd row: 도서출판 한진
3rd row: 도서출판(주)문진미디
4th row: 아동문학사
5th row: 기독교복음침례회
Value | Count | Frequency (%)
도서출판 | 264 | 7.4%
주식회사 | 121 | 3.4%
books | 14 | 0.4%
출판사 | 13 | 0.4%
사단법인 | 11 | 0.3%
press | 10 | 0.3%
스튜디오 | 9 | 0.3%
재단법인 | 7 | 0.2%
미디어 | 6 | 0.2%
(value blank in source) | 5 | 0.1%
Other values (2933) | 3091 | 87.0%

Most occurring characters

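Value | Count | Frequency (%)
(space) | 809 | 4.1%
(character not preserved) | 722 | 3.7%
(character not preserved) | 721 | 3.7%
(character not preserved) | 704 | 3.6%
(character not preserved) | 560 | 2.8%
(character not preserved) | 520 | 2.6%
) | 476 | 2.4%
( | 470 | 2.4%
(character not preserved) | 464 | 2.4%
(character not preserved) | 346 | 1.8%
Other values (772) | 13904 | 70.6%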

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 15737 | 79.9%
Lowercase Letter | 1107 | 5.6%
Uppercase Letter | 960 | 4.9%
Space Separator | 809 | 4.1%
Close Punctuation | 476 | 2.4%
Open Punctuation | 470 | 2.4%
Decimal Number | 65 | 0.3%
Other Punctuation | 57 | 0.3%
Dash Punctuation | 11 | 0.1%
Math Symbol | 3 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
722
 
4.6%
721
 
4.6%
704
 
4.5%
560
 
3.6%
520
 
3.3%
464
 
2.9%
346
 
2.2%
343
 
2.2%
243
 
1.5%
218
 
1.4%
Other values (698) 10896
69.2%
Uppercase Letter
ValueCountFrequency (%)
E 77
 
8.0%
S 74
 
7.7%
O 74
 
7.7%
A 70
 
7.3%
I 60
 
6.2%
B 55
 
5.7%
N 53
 
5.5%
C 49
 
5.1%
T 45
 
4.7%
D 45
 
4.7%
Other values (16) 358
37.3%
Lowercase Letter
ValueCountFrequency (%)
e 137
12.4%
o 117
10.6%
a 95
 
8.6%
n 91
 
8.2%
i 90
 
8.1%
s 86
 
7.8%
r 78
 
7.0%
t 64
 
5.8%
u 40
 
3.6%
l 37
 
3.3%
Other values (14) 272
24.6%
Decimal Number
ValueCountFrequency (%)
1 20
30.8%
2 19
29.2%
3 5
 
7.7%
9 5
 
7.7%
0 5
 
7.7%
7 4
 
6.2%
5 3
 
4.6%
4 2
 
3.1%
6 2
 
3.1%
Other Punctuation
ValueCountFrequency (%)
. 22
38.6%
& 21
36.8%
, 7
 
12.3%
· 3
 
5.3%
" 2
 
3.5%
: 1
 
1.8%
1
 
1.8%
Math Symbol
ValueCountFrequency (%)
< 1
33.3%
> 1
33.3%
+ 1
33.3%
Space Separator
ValueCountFrequency (%)
809
100.0%
Close Punctuation
ValueCountFrequency (%)
) 476
100.0%
Open Punctuation
ValueCountFrequency (%)
( 470
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 11
100.0%
Other Number
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 15714 | 79.8%
Latin | 2067 | 10.5%
Common | 1892 | 9.6%
Han | 23 | 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
722
 
4.6%
721
 
4.6%
704
 
4.5%
560
 
3.6%
520
 
3.3%
464
 
3.0%
346
 
2.2%
343
 
2.2%
243
 
1.5%
218
 
1.4%
Other values (676) 10873
69.2%
Latin
ValueCountFrequency (%)
e 137
 
6.6%
o 117
 
5.7%
a 95
 
4.6%
n 91
 
4.4%
i 90
 
4.4%
s 86
 
4.2%
r 78
 
3.8%
E 77
 
3.7%
S 74
 
3.6%
O 74
 
3.6%
Other values (40) 1148
55.5%
Common
ValueCountFrequency (%)
809
42.8%
) 476
25.2%
( 470
24.8%
. 22
 
1.2%
& 21
 
1.1%
1 20
 
1.1%
2 19
 
1.0%
- 11
 
0.6%
, 7
 
0.4%
3 5
 
0.3%
Other values (14) 32
 
1.7%
Han
ValueCountFrequency (%)
2
 
8.7%
1
 
4.3%
1
 
4.3%
1
 
4.3%
1
 
4.3%
1
 
4.3%
1
 
4.3%
1
 
4.3%
1
 
4.3%
1
 
4.3%
Other values (12) 12
52.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 15711 | 79.8%
ASCII | 3954 | 20.1%
CJK | 23 | 0.1%
None | 5 | < 0.1%
Compat Jamo | 3 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
809
20.5%
) 476
 
12.0%
( 470
 
11.9%
e 137
 
3.5%
o 117
 
3.0%
a 95
 
2.4%
n 91
 
2.3%
i 90
 
2.3%
s 86
 
2.2%
r 78
 
2.0%
Other values (61) 1505
38.1%
Hangul
ValueCountFrequency (%)
722
 
4.6%
721
 
4.6%
704
 
4.5%
560
 
3.6%
520
 
3.3%
464
 
3.0%
346
 
2.2%
343
 
2.2%
243
 
1.5%
218
 
1.4%
Other values (674) 10870
69.2%
None
ValueCountFrequency (%)
· 3
60.0%
1
 
20.0%
1
 
20.0%
Compat Jamo
ValueCountFrequency (%)
2
66.7%
1
33.3%
CJK
ValueCountFrequency (%)
2
 
8.7%
1
 
4.3%
1
 
4.3%
1
 
4.3%
1
 
4.3%
1
 
4.3%
1
 
4.3%
1
 
4.3%
1
 
4.3%
1
 
4.3%
Other values (12) 12
52.2%

사업체소재지
Text

MISSING 

Distinct: 1427
Distinct (%): 92.7%
Missing: 1203
Missing (%): 43.9%
Memory size: 21.6 KiB

Length

Max length: 56
Median length: 45
Mean length: 34.667966
Min length: 22

Characters and Unicode

Total characters: 53354
Distinct characters: 356
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1341
Unique (%): 87.1%

Sample

1st row: 서울특별시 용산구 한강대로62길 26 (한강로1가)
2nd row: 서울특별시 용산구 한강대로62나길 6 (한강로1가)
3rd row: 서울특별시 용산구 만리재로 178 (서계동)
4th row: 서울특별시 용산구 청파로73길 89 (서계동)
5th row: 서울특별시 용산구 효창원로110길 4, 301호 (서계동, 덕성빌딩)
Value | Count | Frequency (%)
서울특별시 | 1539 | 15.4%
용산구 | 1533 | 15.3%
한남동 | 176 | 1.8%
한강대로 | 174 | 1.7%
한강로2가 | 137 | 1.4%
2층 | 121 | 1.2%
이태원동 | 108 | 1.1%
한강로3가 | 104 | 1.0%
이촌동 | 95 | 0.9%
원효로1가 | 94 | 0.9%
Other values (1669) | 5919 | 59.2%
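
A word-frequency table like this one can be approximated by splitting each address on whitespace and counting tokens. The sketch below does that for the five sample rows of 사업체소재지. How ydata-profiling itself tokenises text is not stated in the report, so stripping surrounding parentheses and commas is an assumption made for this illustration.

```python
from collections import Counter

# The five sample rows of 사업체소재지 shown in this report.
addresses = [
    "서울특별시 용산구 한강대로62길 26 (한강로1가)",
    "서울특별시 용산구 한강대로62나길 6 (한강로1가)",
    "서울특별시 용산구 만리재로 178 (서계동)",
    "서울특별시 용산구 청파로73길 89 (서계동)",
    "서울특별시 용산구 효창원로110길 4, 301호 (서계동, 덕성빌딩)",
]

tokens = []
for address in addresses:
    for token in address.split():
        token = token.strip("(),")  # assumed cleanup of surrounding punctuation
        if token:
            tokens.append(token)

word_counts = Counter(tokens)
print(word_counts.most_common(3))
# [('서울특별시', 5), ('용산구', 5), ('서계동', 3)]
```

As in the full table, the city and district names dominate because every address begins with them.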

Most occurring characters

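Value | Count | Frequency (%)
(space) | 8463 | 15.9%
1 | 2002 | 3.8%
(character not preserved) | 1996 | 3.7%
(character not preserved) | 1784 | 3.3%
(character not preserved) | 1764 | 3.3%
(character not preserved) | 1757 | 3.3%
, | 1708 | 3.2%
2 | 1585 | 3.0%
(character not preserved) | 1569 | 2.9%
(character not preserved) | 1561 | 2.9%
Other values (346) | 29165 | 54.7%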

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 30175 | 56.6%
Decimal Number | 9398 | 17.6%
Space Separator | 8463 | 15.9%
Other Punctuation | 1709 | 3.2%
Close Punctuation | 1544 | 2.9%
Open Punctuation | 1544 | 2.9%
Dash Punctuation | 367 | 0.7%
Uppercase Letter | 129 | 0.2%
Lowercase Letter | 24 | < 0.1%
Math Symbol | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1996
 
6.6%
1784
 
5.9%
1764
 
5.8%
1757
 
5.8%
1569
 
5.2%
1561
 
5.2%
1551
 
5.1%
1542
 
5.1%
1540
 
5.1%
1443
 
4.8%
Other values (299) 13668
45.3%
Uppercase Letter
ValueCountFrequency (%)
B 37
28.7%
A 22
17.1%
D 17
13.2%
C 16
12.4%
T 4
 
3.1%
G 4
 
3.1%
H 3
 
2.3%
J 3
 
2.3%
S 3
 
2.3%
U 2
 
1.6%
Other values (11) 18
14.0%
Decimal Number
ValueCountFrequency (%)
1 2002
21.3%
2 1585
16.9%
0 1208
12.9%
3 1106
11.8%
4 832
8.9%
5 666
 
7.1%
6 586
 
6.2%
7 576
 
6.1%
8 442
 
4.7%
9 395
 
4.2%
Lowercase Letter
ValueCountFrequency (%)
e 8
33.3%
c 6
25.0%
k 3
 
12.5%
l 2
 
8.3%
j 1
 
4.2%
i 1
 
4.2%
t 1
 
4.2%
r 1
 
4.2%
o 1
 
4.2%
Other Punctuation
ValueCountFrequency (%)
, 1708
99.9%
& 1
 
0.1%
Space Separator
ValueCountFrequency (%)
8463
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1544
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1544
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 367
100.0%
Math Symbol
ValueCountFrequency (%)
~ 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 30175 | 56.6%
Common | 23026 | 43.2%
Latin | 153 | 0.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1996
 
6.6%
1784
 
5.9%
1764
 
5.8%
1757
 
5.8%
1569
 
5.2%
1561
 
5.2%
1551
 
5.1%
1542
 
5.1%
1540
 
5.1%
1443
 
4.8%
Other values (299) 13668
45.3%
Latin
ValueCountFrequency (%)
B 37
24.2%
A 22
14.4%
D 17
11.1%
C 16
10.5%
e 8
 
5.2%
c 6
 
3.9%
T 4
 
2.6%
G 4
 
2.6%
H 3
 
2.0%
J 3
 
2.0%
Other values (20) 33
21.6%
Common
ValueCountFrequency (%)
8463
36.8%
1 2002
 
8.7%
, 1708
 
7.4%
2 1585
 
6.9%
) 1544
 
6.7%
( 1544
 
6.7%
0 1208
 
5.2%
3 1106
 
4.8%
4 832
 
3.6%
5 666
 
2.9%
Other values (7) 2368
 
10.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 30175 | 56.6%
ASCII | 23179 | 43.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
8463
36.5%
1 2002
 
8.6%
, 1708
 
7.4%
2 1585
 
6.8%
) 1544
 
6.7%
( 1544
 
6.7%
0 1208
 
5.2%
3 1106
 
4.8%
4 832
 
3.6%
5 666
 
2.9%
Other values (37) 2521
 
10.9%
Hangul
ValueCountFrequency (%)
1996
 
6.6%
1784
 
5.9%
1764
 
5.8%
1757
 
5.8%
1569
 
5.2%
1561
 
5.2%
1551
 
5.1%
1542
 
5.1%
1540
 
5.1%
1443
 
4.8%
Other values (299) 13668
45.3%

Interactions

[Figure: interactions plot]

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
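
The nullity views boil down to a per-cell "is this value missing?" flag. The sketch below builds such flags for the first ten rows shown in the report's sample and counts the missing values per column. `None` stands in for `<NA>`, and the variable names are illustrative.

```python
# First ten rows from the report's sample (연번, 사업체명칭, 사업체소재지).
columns = ["연번", "사업체명칭", "사업체소재지"]
rows = [
    (1, "극동출판사", None),
    (2, "도서출판 한진", None),
    (3, "도서출판(주)문진미디", None),
    (4, "아동문학사", None),
    (5, "기독교복음침례회", "서울특별시 용산구 한강대로62길 26 (한강로1가)"),
    (6, "(주)삼중당", None),
    (7, "도서출판 탐구당", "서울특별시 용산구 한강대로62나길 6 (한강로1가)"),
    (8, "장문사", None),
    (9, "문호사", None),
    (10, "문호사", None),
]

# Nullity matrix: True where a cell is missing.
nullity = [[value is None for value in row] for row in rows]

# Missing count per column, i.e. the column sums of the nullity matrix.
missing_per_column = {
    col: sum(flags[i] for flags in nullity) for i, col in enumerate(columns)
}

print(missing_per_column)
# {'연번': 0, '사업체명칭': 0, '사업체소재지': 8}
```

Only 사업체소재지 has gaps in this slice, which matches the alert reported for that column.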

Sample

First rows
Index | 연번 | 사업체명칭 | 사업체소재지
0 | 1 | 극동출판사 | <NA>
1 | 2 | 도서출판 한진 | <NA>
2 | 3 | 도서출판(주)문진미디 | <NA>
3 | 4 | 아동문학사 | <NA>
4 | 5 | 기독교복음침례회 | 서울특별시 용산구 한강대로62길 26 (한강로1가)
5 | 6 | (주)삼중당 | <NA>
6 | 7 | 도서출판 탐구당 | 서울특별시 용산구 한강대로62나길 6 (한강로1가)
7 | 8 | 장문사 | <NA>
8 | 9 | 문호사 | <NA>
9 | 10 | 문호사 | <NA>

Last rows
Index | 연번 | 사업체명칭 | 사업체소재지
2732 | 2733 | UE STUDIO(유이 스튜디오) | 서울특별시 용산구 새창로 120-5, 3층 304호 (용문동, 한강타운)
2733 | 2734 | 재단법인 지구와사람 | 서울특별시 용산구 회나무로 66, 1층 (이태원동)
2734 | 2735 | Bolognese Press | 서울특별시 용산구 대사관로11길 57 (한남동)
2735 | 2736 | 크리에이아이터 | 서울특별시 용산구 소월로44길 2-2, 2층 (이태원동)
2736 | 2737 | 오후의 테이블 | 서울특별시 용산구 청파로49길 37-3, 디테일씨빌딩 1층 (청파동2가)
2737 | 2738 | 책공장 이안재 | 서울특별시 용산구 소월로 377, 402호 (한남동, 남산맨숀)
2738 | 2739 | 영화사 이심전심 | 서울특별시 용산구 우사단로4길 26-3, 2층 (보광동)
2739 | 2740 | 두다 | 서울특별시 용산구 후암로28길 70, 1층 (후암동)
2740 | 2741 | 서혜영스튜디오 | 서울특별시 용산구 두텁바위로60길 49, 대원정사 304호 (후암동)
2741 | 2742 | 주식회사 아키모스피어 | 서울특별시 용산구 한강대로72길 21-17 (남영동)