Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 871
Duplicate rows (%): 8.7%
Total size in memory: 419.9 KiB
Average record size in memory: 43.0 B
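
The derived figures in this table can be checked from the raw counts. A minimal sketch in Python, assuming that KiB means 1024 bytes; the variable names are illustrative and not part of the report:

```python
# Values copied from the "Dataset statistics" table.
n_rows = 10_000
n_duplicates = 871
total_kib = 419.9  # assumption: 1 KiB = 1024 bytes

# Share of rows flagged as duplicates, in percent.
duplicate_pct = 100 * n_duplicates / n_rows

# Average number of bytes held in memory per observation.
avg_record_bytes = total_kib * 1024 / n_rows

print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Both results round to the values shown in the table.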

Variable types

Numeric: 3
Text: 1

Dataset

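Description: 처리일자 (processing date), 청소년유해업소업종코드 (business-type code for establishments harmful to youth), 청소년유해업소업종명 (business-type name for establishments harmful to youth), 건수 (count)
Author: Seocho-gu (서초구)
URL: https://data.seoul.go.kr/dataList/OA-11040/S/1/datasetView.do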

Alerts

Dataset has 871 (8.7%) duplicate rows (Duplicates)

Reproduction

Analysis started: 2024-05-11 16:19:39.425279
Analysis finished: 2024-05-11 16:19:44.885397
Duration: 5.46 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
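
A report of this kind can be regenerated with ydata-profiling. The sketch below is an assumption about how such a run might look, not the script that produced this report: `build_report`, the CSV path, and the report title are hypothetical, and it presumes the dataset has been saved as a CSV file that pandas can read with default settings.

```python
def build_report(csv_path: str, output_path: str = "report.html") -> None:
    """Profile a CSV file and write the result as an HTML report."""
    # Imported lazily so this module still loads when the third-party
    # packages (pandas, ydata-profiling) are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Hypothetical local copy of the dataset; the encoding may need to be
    # passed explicitly depending on how the file was exported.
    df = pd.read_csv(csv_path)

    profile = ProfileReport(df, title="Profiling report")
    profile.to_file(output_path)
```

A call such as `build_report("dataset.csv")` would then write `report.html` next to the script; the file name here is a placeholder.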

Variables

처리일자 (processing date)
Real number (ℝ)

Distinct: 30
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20240448
Minimum: 20240411
Maximum: 20240510
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 20240411
5-th percentile: 20240412
Q1: 20240418
median: 20240425
Q3: 20240503
95-th percentile: 20240509
Maximum: 20240510
Range: 99
Interquartile range (IQR): 85
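
Because 처리일자 holds dates encoded as YYYYMMDD integers, the range and IQR above are differences between those integers, not numbers of days. A short sketch of the distinction, using only the minimum and maximum from the table; the helper name `parse_yyyymmdd` is illustrative:

```python
from datetime import date, datetime

def parse_yyyymmdd(value: int) -> date:
    """Interpret an integer such as 20240411 as a calendar date."""
    return datetime.strptime(str(value), "%Y%m%d").date()

first = parse_yyyymmdd(20240411)  # Minimum
last = parse_yyyymmdd(20240510)   # Maximum

# Plain integer subtraction, which is what the report lists as "Range".
numeric_range = 20240510 - 20240411

# Elapsed calendar days between the two dates.
calendar_span = (last - first).days

print(numeric_range, calendar_span)
```

The integer difference is 99, while the calendar span is 29 days. Counting both end dates gives 30 days, which agrees with the 30 distinct values reported for this variable.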

Descriptive statistics

Standard deviation: 40.178781
Coefficient of variation (CV): 1.9850737 × 10^-6
Kurtosis: -1.4243635
Mean: 20240448
Median Absolute Deviation (MAD): 9
Skewness: 0.71164045
Sum: 2.0240448 × 10^11
Variance: 1614.3344
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=30)

Common values
Value              Count  Frequency (%)
20240510           365    3.6%
20240423           356    3.6%
20240430           353    3.5%
20240412           352    3.5%
20240414           351    3.5%
20240419           345    3.5%
20240411           345    3.5%
20240427           344    3.4%
20240417           344    3.4%
20240416           342    3.4%
Other values (20)  6503   65.0%

Minimum 10 values
Value     Count  Frequency (%)
20240411  345    3.5%
20240412  352    3.5%
20240413  337    3.4%
20240414  351    3.5%
20240415  314    3.1%
20240416  342    3.4%
20240417  344    3.4%
20240418  338    3.4%
20240419  345    3.5%
20240420  340    3.4%

Maximum 10 values
Value     Count  Frequency (%)
20240510  365    3.6%
20240509  329    3.3%
20240508  329    3.3%
20240507  321    3.2%
20240506  326    3.3%
20240505  325    3.2%
20240504  316    3.2%
20240503  284    2.8%
20240502  329    3.3%
20240501  319    3.2%

청소년유해업소업종코드 (business-type code for establishments harmful to youth)
Real number (ℝ)

Distinct: 59
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12935.793
Minimum: 10101
Maximum: 30111
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 10101
5-th percentile: 10103
Q1: 10113
median: 10210
Q3: 10413
95-th percentile: 24201
Maximum: 30111
Range: 20010
Interquartile range (IQR): 300

Descriptive statistics

Standard deviation: 5262.2612
Coefficient of variation (CV): 0.4067985
Kurtosis: 1.34035
Mean: 12935.793
Median Absolute Deviation (MAD): 107
Skewness: 1.6443408
Sum: 1.2935793 × 10^8
Variance: 27691393
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value              Count  Frequency (%)
10113              249    2.5%
20301              245    2.5%
10116              240    2.4%
24205              239    2.4%
10104              237    2.4%
20105              237    2.4%
10102              236    2.4%
10106              236    2.4%
20101              235    2.4%
10208              233    2.3%
Other values (49)  7613   76.1%

Minimum 10 values
Value  Count  Frequency (%)
10101  222    2.2%
10102  236    2.4%
10103  228    2.3%
10104  237    2.4%
10105  222    2.2%
10106  236    2.4%
10107  216    2.2%
10108  113    1.1%
10109  18     0.2%
10110  193    1.9%

Maximum 10 values
Value  Count  Frequency (%)
30111  142    1.4%
30110  63     0.6%
24205  239    2.4%
24201  184    1.8%
24113  198    2.0%
24101  95     0.9%
20399  145    1.5%
20301  245    2.5%
20199  93     0.9%
20107  178    1.8%

청소년유해업소업종명 (business-type name for establishments harmful to youth)
Text

Distinct: 54
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 12
Median length: 9
Mean length: 4.2192
Min length: 2
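
The length statistics count characters, not bytes. A small sketch, assuming each string is measured with Python's `len` (which counts Unicode code points) and using the five sample values listed under "Sample" for this variable:

```python
# The five sample values shown for this variable in the report.
samples = ["비어(바)살롱", "중국식", "편의점", "호프(소주방)", "복어취급"]

# len() counts code points, so each Hangul syllable counts as one character
# (assuming the text is stored in precomposed form).
lengths = [len(s) for s in samples]
print(lengths)

# The mean length follows from the totals in the report:
# total characters divided by the number of observations.
mean_length = 42192 / 10000
print(mean_length)
```

The second figure reproduces the reported mean length of 4.2192.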

Characters and Unicode

Total characters: 42192
Distinct characters: 128
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
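
As an illustration of those character properties, the standard-library `unicodedata` module exposes the general category and the name of each code point. The sketch below is not how the report computes its tables; the helper names and the name-prefix test for Hangul are assumptions made for the example.

```python
import unicodedata
from collections import Counter

def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category (e.g. Lo, Ps, Pe)."""
    return Counter(unicodedata.category(ch) for ch in text)

def is_hangul(ch: str) -> bool:
    """Heuristic: treat a character as Hangul if its Unicode name says so."""
    return unicodedata.name(ch, "").startswith("HANGUL")

value = "호프(소주방)"  # one of the sample values in this report
counts = category_counts(value)
hangul_chars = sum(is_hangul(ch) for ch in value)

print(dict(counts))
print(hangul_chars)
```

The two-letter codes correspond to the category names used in the tables that follow: `Lo` is Other Letter, `Ps` is Open Punctuation, `Pe` is Close Punctuation, `Zs` is Space Separator, and `Po` is Other Punctuation.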

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 비어(바)살롱
2nd row: 중국식
3rd row: 편의점
4th row: 호프(소주방)
5th row: 복어취급

Value              Count  Frequency (%)
기타               841    8.2%
패스트푸드         438    4.3%
전통찻집           317    3.1%
관광호텔           287    2.8%
통닭(치킨          249    2.4%
일반이용업         245    2.4%
생선회             240    2.3%
노래연습장업       239    2.3%
여관업             237    2.3%
일식               237    2.3%
Other values (44)  6908   67.5%

Most occurring characters

Value               Count  Frequency (%)
                    1819   4.3%
                    1612   3.8%
(                   1207   2.9%
)                   1207   2.9%
                    1082   2.6%
                    1016   2.4%
                    845    2.0%
                    841    2.0%
                    841    2.0%
                    750    1.8%
Other values (118)  30972  73.4%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       39324  93.2%
Open Punctuation   1207   2.9%
Close Punctuation  1207   2.9%
Space Separator    238    0.6%
Other Punctuation  216    0.5%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
                    1819   4.6%
                    1612   4.1%
                    1082   2.8%
                    1016   2.6%
                    845    2.1%
                    841    2.1%
                    841    2.1%
                    750    1.9%
                    749    1.9%
                    742    1.9%
Other values (114)  29027  73.8%

Open Punctuation
Value  Count  Frequency (%)
(      1207   100.0%

Close Punctuation
Value  Count  Frequency (%)
)      1207   100.0%

Space Separator
Value    Count  Frequency (%)
(space)  238    100.0%

Other Punctuation
Value  Count  Frequency (%)
,      216    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  39324  93.2%
Common  2868   6.8%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
                    1819   4.6%
                    1612   4.1%
                    1082   2.8%
                    1016   2.6%
                    845    2.1%
                    841    2.1%
                    841    2.1%
                    750    1.9%
                    749    1.9%
                    742    1.9%
Other values (114)  29027  73.8%

Common
Value    Count  Frequency (%)
(        1207   42.1%
)        1207   42.1%
(space)  238    8.3%
,        216    7.5%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  39324  93.2%
ASCII   2868   6.8%

Most frequent character per block

Hangul
Value               Count  Frequency (%)
                    1819   4.6%
                    1612   4.1%
                    1082   2.8%
                    1016   2.6%
                    845    2.1%
                    841    2.1%
                    841    2.1%
                    750    1.9%
                    749    1.9%
                    742    1.9%
Other values (114)  29027  73.8%

ASCII
Value    Count  Frequency (%)
(        1207   42.1%
)        1207   42.1%
(space)  238    8.3%
,        216    7.5%

건수 (count)
Real number (ℝ)

Distinct: 724
Distinct (%): 7.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 155.342
Minimum: 1
Maximum: 4926
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
median: 26
Q3: 132
95-th percentile: 605
Maximum: 4926
Range: 4925
Interquartile range (IQR): 128

Descriptive statistics

Standard deviation: 379.77098
Coefficient of variation (CV): 2.4447411
Kurtosis: 35.616254
Mean: 155.342
Median Absolute Deviation (MAD): 25
Skewness: 5.2656112
Sum: 1553420
Variance: 144225.99
Monotonicity: Not monotonic
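
Several of the figures for 건수 are simple functions of the others, which makes them easy to cross-check. A minimal sketch using the values reported for this variable; the variable names are illustrative:

```python
# Values copied from the statistics reported for this variable.
n_rows = 10_000
mean, std = 155.342, 379.77098
q1, q3 = 4, 132
minimum, maximum = 1, 4926

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
total = round(mean * n_rows)     # sum of all values

print(f"CV={cv:.7f} variance={variance:.2f} IQR={iqr} "
      f"range={value_range} sum={total}")
```

The results agree with the report to within rounding of the published standard deviation. A mean (155.342) far above the median (26), together with a skewness of about 5.27, indicates a strongly right-skewed distribution.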
Histogram with fixed size bins (bins=50)

Common values
Value               Count  Frequency (%)
1                   1089   10.9%
2                   689    6.9%
4                   452    4.5%
3                   381    3.8%
5                   343    3.4%
7                   201    2.0%
8                   187    1.9%
6                   160    1.6%
14                  152    1.5%
11                  144    1.4%
Other values (714)  6202   62.0%

Minimum 10 values
Value  Count  Frequency (%)
1      1089   10.9%
2      689    6.9%
3      381    3.8%
4      452    4.5%
5      343    3.4%
6      160    1.6%
7      201    2.0%
8      187    1.9%
9      139    1.4%
10     101    1.0%

Maximum 10 values
Value  Count  Frequency (%)
4926   2      < 0.1%
4925   1      < 0.1%
4922   1      < 0.1%
4921   1      < 0.1%
3221   1      < 0.1%
3220   2      < 0.1%
3219   4      < 0.1%
3136   2      < 0.1%
3134   2      < 0.1%
3132   1      < 0.1%

Interactions

[Figures: nine pairwise interaction plots]

Correlations

Variable                처리일자  청소년유해업소업종코드  청소년유해업소업종명  건수
처리일자                1.000     0.000                   0.000                 0.000
청소년유해업소업종코드  0.000     1.000                   0.999                 0.134
청소년유해업소업종명    0.000     0.999                   1.000                 0.736
건수                    0.000     0.134                   0.736                 1.000

Variable                처리일자  청소년유해업소업종코드  건수
처리일자                1.000     -0.002                  -0.005
청소년유해업소업종코드  -0.002    1.000                   -0.265
건수                    -0.005    -0.265                  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index  처리일자  청소년유해업소업종코드  청소년유해업소업종명  건수
27515  20240506  10207                   비어(바)살롱          4
19851  20240429  10102                   중국식                143
16234  20240425  10410                   편의점                218
18569  20240427  10112                   호프(소주방)          624
11740  20240421  10114                   복어취급              2
27079  20240505  10408                   철도역구내            1
7941   20240418  10301                   단란주점              45
6568   20240416  10411                   패스트푸드            104
6186   20240416  10201                   카바레                8
24394  20240503  30111                   무도학원업            3

Index  처리일자  청소년유해업소업종코드  청소년유해업소업종명  건수
23060  20240501  10499                   기타                  617
30244  20240508  10210                   간이주점              1
6365   20240416  24101                   게임제공업            3
11700  20240421  10116                   생선회                33
2453   20240413  10407                   유원지                3
25948  20240504  10111                   패스트푸드            27
27953  20240506  10116                   생선회                3
1677   20240412  10410                   편의점                275
21199  20240430  10199                   기타                  493
32947  20240510  10201                   카바레                2

Duplicate rows

Most frequently occurring

Index  처리일자  청소년유해업소업종코드  청소년유해업소업종명  건수  # duplicates
771    20240507  10413                   전통찻집              1     7
7      20240411  10202                   고고(디스코)클럽      1     6
159    20240415  24201                   비디오물감상실업      1     6
150    20240415  10409                   백화점                1     5
211    20240417  10409                   백화점                1     5
257    20240419  10201                   카바레                2     5
456    20240426  10210                   간이주점              1     5
666    20240503  10413                   전통찻집              1     5
682    20240504  10201                   카바레                2     5
694    20240504  20399                   이용업 기타           1     5
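
A table like the one above can be derived by counting how often each complete row occurs. The sketch below uses only the standard library and a handful of made-up rows in the same four-column layout (date, code, name, count); it illustrates the idea and is not the routine ydata-profiling itself uses.

```python
from collections import Counter

# Hypothetical rows: (처리일자, 청소년유해업소업종코드, 청소년유해업소업종명, 건수).
rows = [
    (20240507, 10413, "전통찻집", 1),
    (20240507, 10413, "전통찻집", 1),
    (20240507, 10413, "전통찻집", 1),
    (20240419, 10201, "카바레", 2),
    (20240419, 10201, "카바레", 2),
    (20240416, 10201, "카바레", 8),
]

# Count identical rows; tuples are hashable, so they can serve as keys.
occurrences = Counter(rows)

# Keep only rows that appear more than once, most frequent first.
duplicated = [(row, n) for row, n in occurrences.most_common() if n > 1]

for row, n in duplicated:
    print(row, n)
```

Rows that differ in any column, such as the two 카바레 entries with different dates and counts, are treated as distinct.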