Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 913
Duplicate rows (%): 9.1%
Total size in memory: 419.9 KiB
Average record size in memory: 43.0 B

Variable types

Numeric: 3
Text: 1

Dataset

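Description: 처리일자 (processing date), 청소년유해업소업종코드 (business type code for establishments harmful to youth), 청소년유해업소업종명 (business type name for establishments harmful to youth), 건수 (count)
Author: 은평구 (Eunpyeong-gu)
URL: https://data.seoul.go.kr/dataList/OA-10501/S/1/datasetView.do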

Alerts

Dataset has 913 (9.1%) duplicate rows. [Duplicates]
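
As a rough illustration of how such a duplicate count can be derived, the following sketch counts repeated rows with the standard library. The rows are hypothetical tuples in the report's column order, and the report does not state whether its figure of 913 counts distinct repeated rows or surplus copies, so the sketch computes both.

```python
from collections import Counter

# Hypothetical rows in the order (처리일자, 업종코드, 업종명, 건수).
rows = [
    (20240419, 10413, "전통찻집", 1),
    (20240419, 10413, "전통찻집", 1),
    (20240419, 10413, "전통찻집", 1),
    (20240425, 10201, "카바레", 2),
    (20240425, 10201, "카바레", 2),
    (20240411, 10101, "한식", 1557),
]

# Tuples are hashable, so Counter groups identical rows directly.
occurrences = Counter(rows)
repeated = {row: n for row, n in occurrences.items() if n > 1}

n_groups = len(repeated)                        # distinct rows seen more than once
n_extra = sum(n - 1 for n in repeated.values())  # copies beyond the first

# The percentage in the alert is just the count over the number of observations.
reported_pct = round(913 / 10_000 * 100, 1)

print(n_groups, n_extra, reported_pct)
```

Either definition can be reported as "duplicate rows", so it is worth checking which one a tool uses before comparing figures across tools.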

Reproduction

Analysis started: 2024-05-11 15:49:49.630883
Analysis finished: 2024-05-11 15:49:55.116596
Duration: 5.49 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

처리일자
Real number (ℝ)

Distinct: 30
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20240448
Minimum: 20240411
Maximum: 20240510
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 20240411
5-th percentile: 20240412
Q1: 20240418
Median: 20240425
Q3: 20240503
95-th percentile: 20240509
Maximum: 20240510
Range: 99
Interquartile range (IQR): 85
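
The values of 처리일자 look like dates encoded as YYYYMMDD integers (an assumption based on the values and the column name). If so, the range and IQR above are differences between integers, not spans of days. A minimal sketch of the difference, using only the quantiles reported above:

```python
from datetime import date, datetime


def parse_yyyymmdd(value: int) -> date:
    """Interpret an integer such as 20240411 as a calendar date (assumed encoding)."""
    return datetime.strptime(str(value), "%Y%m%d").date()


minimum, q1, q3, maximum = 20240411, 20240418, 20240503, 20240510

# What the profile reports: plain integer arithmetic.
int_range = maximum - minimum
int_iqr = q3 - q1

# What the same quantiles mean once read as dates.
day_range = (parse_yyyymmdd(maximum) - parse_yyyymmdd(minimum)).days
day_iqr = (parse_yyyymmdd(q3) - parse_yyyymmdd(q1)).days

print(f"integer range {int_range}, integer IQR {int_iqr}")
print(f"calendar range {day_range} days, calendar IQR {day_iqr} days")
```

Under that assumption, the 30 distinct values correspond to 30 consecutive calendar days, 11 April to 10 May 2024.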

Descriptive statistics

Standard deviation: 40.188644
Coefficient of variation (CV): 1.985561 × 10^-6
Kurtosis: -1.4369123
Mean: 20240448
Median Absolute Deviation (MAD): 9
Skewness: 0.70248489
Sum: 2.0240448 × 10^11
Variance: 1615.1271
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=30)
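
Fixed-size binning splits the interval between the minimum and the maximum into equal-width bins. The sketch below is a plain-Python approximation of that idea (not the report's actual plotting code; `fixed_bin_counts` is a name made up for this example), applied to the 30 distinct 처리일자 values. Because the integers jump from 20240430 to 20240501, most of the 30 bins stay empty.

```python
def fixed_bin_counts(values, bins):
    """Count values in `bins` equal-width bins spanning [min(values), max(values)]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate case: every value is identical, so use the first bin.
        return [len(values)] + [0] * (bins - 1)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum sits on the right edge; clamp it into the last bin.
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    return counts


# One entry per distinct value: 20240411..20240430 and 20240501..20240510.
distinct_dates = list(range(20240411, 20240431)) + list(range(20240501, 20240511))

counts = fixed_bin_counts(distinct_dates, bins=30)
non_empty = sum(1 for c in counts if c > 0)
print(non_empty, "of", len(counts), "bins contain data")
```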
Common values
Value              Count  Frequency (%)
20240430           371    3.7%
20240427           352    3.5%
20240420           351    3.5%
20240412           351    3.5%
20240510           350    3.5%
20240422           349    3.5%
20240411           346    3.5%
20240424           344    3.4%
20240429           342    3.4%
20240417           341    3.4%
Other values (20)  6503   65.0%

Minimum 10 values
Value              Count  Frequency (%)
20240411           346    3.5%
20240412           351    3.5%
20240413           329    3.3%
20240414           328    3.3%
20240415           324    3.2%
20240416           330    3.3%
20240417           341    3.4%
20240418           339    3.4%
20240419           324    3.2%
20240420           351    3.5%

Maximum 10 values
Value              Count  Frequency (%)
20240510           350    3.5%
20240509           341    3.4%
20240508           332    3.3%
20240507           321    3.2%
20240506           324    3.2%
20240505           323    3.2%
20240504           338    3.4%
20240503           294    2.9%
20240502           330    3.3%
20240501           308    3.1%

청소년유해업소업종코드
Real number (ℝ)

Distinct: 59
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12872.816
Minimum: 10101
Maximum: 30111
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 10101
5-th percentile: 10103
Q1: 10113
Median: 10208
Q3: 10413
95-th percentile: 24201
Maximum: 30111
Range: 20010
Interquartile range (IQR): 300

Descriptive statistics

Standard deviation: 5201.2646
Coefficient of variation (CV): 0.40405027
Kurtosis: 1.4141705
Mean: 12872.816
Median Absolute Deviation (MAD): 105
Skewness: 1.6679287
Sum: 1.2872816 × 10^8
Variance: 27053154
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value              Count  Frequency (%)
10101              243    2.4%
10117              241    2.4%
10301              240    2.4%
10104              237    2.4%
10116              236    2.4%
10118              234    2.3%
20105              233    2.3%
10403              232    2.3%
10105              231    2.3%
10107              231    2.3%
Other values (49)  7642   76.4%

Minimum 10 values
Value              Count  Frequency (%)
10101              243    2.4%
10102              205    2.1%
10103              229    2.3%
10104              237    2.4%
10105              231    2.3%
10106              228    2.3%
10107              231    2.3%
10108              127    1.3%
10109              14     0.1%
10110              194    1.9%

Maximum 10 values
Value              Count  Frequency (%)
30111              123    1.2%
30110              66     0.7%
24205              223    2.2%
24201              199    2.0%
24113              219    2.2%
24101              68     0.7%
20399              136    1.4%
20301              223    2.2%
20199              97     1.0%
20107              195    1.9%

청소년유해업소업종명
Text

Distinct: 54
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 12
Median length: 9
Mean length: 4.2418
Min length: 2

Characters and Unicode

Total characters: 42418
Distinct characters: 128
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
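
To make the idea concrete, here is a small sketch that uses Python's standard `unicodedata` module to tally general categories, in the spirit of the tables further down. The three names are values that appear elsewhere in this report; the mapping from two-letter codes to long names is written out by hand and covers only the categories this report lists.

```python
import unicodedata
from collections import Counter

# Hand-written subset of Unicode general category names (illustrative, not exhaustive).
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
}

names = ["고고(디스코)클럽", "이용업 기타", "한식"]

# Look up the general category of every character in every name.
category_counts = Counter(
    CATEGORY_NAMES.get(unicodedata.category(ch), unicodedata.category(ch))
    for name in names
    for ch in name
)

total_characters = sum(len(name) for name in names)
mean_length = total_characters / len(names)

for category, count in category_counts.most_common():
    print(f"{category}: {count}")
print(f"total characters: {total_characters}, mean length: {mean_length:.4f}")
```

The same ratio explains the figures above: 42418 total characters over 10000 values gives the mean length of 4.2418.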

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 철도역구내
2nd row: 관광호텔
3rd row: 탕류
4th row: 간이주점
5th row: 다방

Value              Count  Frequency (%)
기타               824    8.1%
패스트푸드         442    4.3%
전통찻집           328    3.2%
관광호텔           270    2.6%
한식               243    2.4%
까페               241    2.4%
단란주점           240    2.3%
일식               237    2.3%
생선회             236    2.3%
식육취급           234    2.3%
Other values (44)  6938   67.8%

Most occurring characters

Value               Count  Frequency (%)
                    1782   4.2%
                    1621   3.8%
(                   1228   2.9%
)                   1228   2.9%
                    1097   2.6%
                    988    2.3%
                    860    2.0%
                    824    1.9%
                    824    1.9%
                    794    1.9%
Other values (118)  31172  73.5%

Most occurring categories

Value               Count  Frequency (%)
Other Letter        39498  93.1%
Open Punctuation    1228   2.9%
Close Punctuation   1228   2.9%
Space Separator     233    0.5%
Other Punctuation   231    0.5%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
                    1782   4.5%
                    1621   4.1%
                    1097   2.8%
                    988    2.5%
                    860    2.2%
                    824    2.1%
                    824    2.1%
                    794    2.0%
                    790    2.0%
                    747    1.9%
Other values (114)  29171  73.9%

Open Punctuation
Value               Count  Frequency (%)
(                   1228   100.0%

Close Punctuation
Value               Count  Frequency (%)
)                   1228   100.0%

Space Separator
Value               Count  Frequency (%)
                    233    100.0%

Other Punctuation
Value               Count  Frequency (%)
,                   231    100.0%

Most occurring scripts

Value               Count  Frequency (%)
Hangul              39498  93.1%
Common              2920   6.9%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
                    1782   4.5%
                    1621   4.1%
                    1097   2.8%
                    988    2.5%
                    860    2.2%
                    824    2.1%
                    824    2.1%
                    794    2.0%
                    790    2.0%
                    747    1.9%
Other values (114)  29171  73.9%

Common
Value               Count  Frequency (%)
(                   1228   42.1%
)                   1228   42.1%
                    233    8.0%
,                   231    7.9%

Most occurring blocks

Value               Count  Frequency (%)
Hangul              39498  93.1%
ASCII               2920   6.9%

Most frequent character per block

Hangul
Value               Count  Frequency (%)
                    1782   4.5%
                    1621   4.1%
                    1097   2.8%
                    988    2.5%
                    860    2.2%
                    824    2.1%
                    824    2.1%
                    794    2.0%
                    790    2.0%
                    747    1.9%
Other values (114)  29171  73.9%

ASCII
Value               Count  Frequency (%)
(                   1228   42.1%
)                   1228   42.1%
                    233    8.0%
,                   231    7.9%

건수
Real number (ℝ)

Distinct: 742
Distinct (%): 7.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 162.0118
Minimum: 1
Maximum: 4930
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
Median: 24
Q3: 134
95-th percentile: 630
Maximum: 4930
Range: 4929
Interquartile range (IQR): 130

Descriptive statistics

Standard deviation: 404.82889
Coefficient of variation (CV): 2.4987617
Kurtosis: 39.4036
Mean: 162.0118
Median Absolute Deviation (MAD): 23
Skewness: 5.4740046
Sum: 1620118
Variance: 163886.43
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
1                   1128   11.3%
2                   726    7.3%
4                   479    4.8%
3                   353    3.5%
5                   341    3.4%
8                   194    1.9%
7                   193    1.9%
6                   185    1.8%
9                   157    1.6%
14                  142    1.4%
Other values (732)  6102   61.0%

Minimum 10 values
Value               Count  Frequency (%)
1                   1128   11.3%
2                   726    7.3%
3                   353    3.5%
4                   479    4.8%
5                   341    3.4%
6                   185    1.8%
7                   193    1.9%
8                   194    1.9%
9                   157    1.6%
10                  84     0.8%

Maximum 10 values
Value               Count  Frequency (%)
4930                1      < 0.1%
4926                4      < 0.1%
4925                1      < 0.1%
4924                1      < 0.1%
4923                2      < 0.1%
4917                1      < 0.1%
4916                1      < 0.1%
3221                3      < 0.1%
3220                2      < 0.1%
3219                3      < 0.1%

Interactions

[Figure: interaction plots, 9 panels]

Correlations

                        처리일자  청소년유해업소업종코드  청소년유해업소업종명  건수
처리일자                1.000     0.024                   0.000                 0.011
청소년유해업소업종코드  0.024     1.000                   0.999                 0.141
청소년유해업소업종명    0.000     0.999                   1.000                 0.743
건수                    0.011     0.141                   0.743                 1.000
                        처리일자  청소년유해업소업종코드  건수
처리일자                1.000     0.006                   0.005
청소년유해업소업종코드  0.006     1.000                   -0.274
건수                    0.005     -0.274                  1.000
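
The report text does not say which coefficient each matrix holds. For orientation only, the sketch below implements two common choices, Pearson and Spearman (rank-based), in plain Python. The function names and the toy data are illustrative and are not taken from ydata-profiling.

```python
import math


def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Toy data: y grows with x, but not linearly.
xs = [1, 2, 3, 4, 5]
ys = [1, 4, 9, 16, 25]
print(f"pearson = {pearson(xs, ys):.3f}, spearman = {spearman(xs, ys):.3f}")
```

Both coefficients are symmetric in their arguments, which matches the symmetric layout of the matrices above.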

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index)  처리일자  청소년유해업소업종코드  청소년유해업소업종명  건수
22701  20240420  10408  철도역구내  20
9133  20240502  20101  관광호텔  4
21451  20240421  10119  탕류  1
12671  20240429  10210  간이주점  4
15269  20240427  10402  다방  10
23501  20240419  10101  한식  1557
27118  20240416  20102  일반호텔  8
14230  20240428  10104  일식  180
15548  20240426  20399  이용업 기타  1
15869  20240426  10102  중국식  142

(index)  처리일자  청소년유해업소업종코드  청소년유해업소업종명  건수
20290  20240422  24201  비디오물감상실업  1
17050  20240425  10199  기타  877
4785  20240506  20107  여인숙업  21
9680  20240502  10101  한식  2858
6005  20240505  10201  카바레  2
32421  20240411  10202  고고(디스코)클럽  5
9345  20240502  10208  룸살롱  138
806  20240510  10403  일반조리판매  129
2108  20240509  10118  식육취급  39
958  20240510  20102  일반호텔  8

Duplicate rows

Most frequently occurring

(index)  처리일자  청소년유해업소업종코드  청소년유해업소업종명  건수  # duplicates
258  20240419  10413  전통찻집  1  5
435  20240425  10201  카바레  2  5
486  20240426  24201  비디오물감상실업  1  5
516  20240427  20399  이용업 기타  1  5
534  20240428  10201  카바레  2  5
644  20240501  20399  이용업 기타  1  5
778  20240506  10413  전통찻집  1  5
908  20240510  20399  이용업 기타  1  5
38  20240412  10119  탕류  4  4
57  20240412  24201  비디오물감상실업  1  4