Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 880
Duplicate rows (%): 8.8%
Total size in memory: 419.9 KiB
Average record size in memory: 43.0 B

Variable types

Numeric: 3
Text: 1

Dataset

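Description: 처리일자 (processing date), 청소년유해업소업종코드 (youth-harmful business type code), 청소년유해업소업종명 (youth-harmful business type name), 건수 (count)
Author: 강북구 (Gangbuk-gu)
URL: https://data.seoul.go.kr/dataList/OA-10886/S/1/datasetView.do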

Alerts

Duplicates: Dataset has 880 (8.8%) duplicate rows
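The alert does not say how the duplicate count is defined, so the following is only a minimal sketch that computes two common definitions: the number of distinct row values that occur more than once, and the number of surplus copies beyond the first. The helper name `duplicate_summary` and the toy rows are hypothetical; only the 880 and 10000 figures come from the report.

```python
from collections import Counter


def duplicate_summary(rows):
    """Return (duplicated_groups, surplus_rows) for an iterable of hashable rows."""
    counts = Counter(rows)
    # Distinct row values that occur more than once.
    duplicated_groups = sum(1 for c in counts.values() if c > 1)
    # Rows that would be dropped if only the first copy of each value were kept.
    surplus_rows = sum(c - 1 for c in counts.values() if c > 1)
    return duplicated_groups, surplus_rows


# Hypothetical toy rows shaped like (처리일자, 업종코드, 업종명, 건수); not the real data.
rows = [
    (20240420, 24201, "비디오물감상실업", 1),
    (20240420, 24201, "비디오물감상실업", 1),
    (20240420, 24201, "비디오물감상실업", 1),
    (20240429, 10201, "카바레", 2),
    (20240429, 10201, "카바레", 2),
    (20240411, 20102, "일반호텔", 1),
]
groups, surplus = duplicate_summary(rows)

# Share reported in the alert: 880 duplicates out of 10000 observations.
reported_share = 880 / 10000
print(groups, surplus, f"{reported_share:.1%}")
```

Whichever definition applies, the reported share is simply the duplicate count divided by the number of observations.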

Reproduction

Analysis started: 2024-05-11 15:55:05.554468
Analysis finished: 2024-05-11 15:55:11.040595
Duration: 5.49 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

처리일자
Real number (ℝ)

Distinct: 30
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20240449
Minimum: 20240411
Maximum: 20240510
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 20240411
5-th percentile: 20240412
Q1: 20240418
Median: 20240426
Q3: 20240503
95-th percentile: 20240509
Maximum: 20240510
Range: 99
Interquartile range (IQR): 85

Descriptive statistics

Standard deviation: 40.343877
Coefficient of variation (CV): 1.9932304 × 10^-6
Kurtosis: -1.481708
Mean: 20240449
Median Absolute Deviation (MAD): 11
Skewness: 0.67176885
Sum: 2.0240449 × 10^11
Variance: 1627.6284
Monotonicity: Not monotonic
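Two of the figures above follow directly from others: the coefficient of variation is the standard deviation divided by the mean, and the variance is the standard deviation squared. The sketch below recomputes both from the rounded values shown in the report, so small discrepancies in the last digits are expected.

```python
def coefficient_of_variation(std, mean):
    """CV = standard deviation / mean (dimensionless)."""
    return std / mean


# Rounded values as printed in the report for 처리일자.
std = 40.343877
mean = 20240449

cv = coefficient_of_variation(std, mean)
variance = std ** 2

print(f"CV       = {cv:.7e}")
print(f"variance = {variance:.4f}")
```

The tiny CV here is a side effect of the large integer encoding of the values and says little about how spread out the underlying dates are.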
Histogram with fixed size bins (bins=30)
Value | Count | Frequency (%)
20240507 | 369 | 3.7%
20240415 | 365 | 3.6%
20240423 | 356 | 3.6%
20240501 | 352 | 3.5%
20240428 | 342 | 3.4%
20240430 | 341 | 3.4%
20240420 | 341 | 3.4%
20240414 | 340 | 3.4%
20240502 | 340 | 3.4%
20240508 | 337 | 3.4%
Other values (20) | 6517 | 65.2%
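The frequency table lists the ten most common values and folds everything else into an "Other values" row. A minimal sketch of that kind of summary, using a hypothetical helper `top_k_with_other` and a toy column, is shown below together with a cross-check of the reported counts.

```python
from collections import Counter


def top_k_with_other(values, k=10):
    """Return the k most frequent values plus a summary of everything else."""
    counts = Counter(values)
    top = counts.most_common(k)
    other_distinct = len(counts) - len(top)
    other_count = sum(counts.values()) - sum(c for _, c in top)
    return top, other_distinct, other_count


# Hypothetical toy column, not the real data.
toy = [20240507] * 3 + [20240415] * 2 + [20240411]
top, other_distinct, other_count = top_k_with_other(toy, k=2)
print(top, other_distinct, other_count)

# Cross-check with the counts reported for the ten most common values:
# the remainder of the 10000 observations should match the "Other values" row.
reported_top10 = [369, 365, 356, 352, 342, 341, 341, 340, 340, 337]
reported_other = 10000 - sum(reported_top10)
print(reported_other)
```

The remainder agrees with the 6517 shown in the "Other values (20)" row, and 30 distinct values minus the ten listed gives the 20 in parentheses.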

Value | Count | Frequency (%)
20240411 | 329 | 3.3%
20240412 | 332 | 3.3%
20240413 | 337 | 3.4%
20240414 | 340 | 3.4%
20240415 | 365 | 3.6%
20240416 | 321 | 3.2%
20240417 | 336 | 3.4%
20240418 | 328 | 3.3%
20240419 | 323 | 3.2%
20240420 | 341 | 3.4%

Value | Count | Frequency (%)
20240510 | 318 | 3.2%
20240509 | 300 | 3.0%
20240508 | 337 | 3.4%
20240507 | 369 | 3.7%
20240506 | 336 | 3.4%
20240505 | 313 | 3.1%
20240504 | 331 | 3.3%
20240503 | 332 | 3.3%
20240502 | 340 | 3.4%
20240501 | 352 | 3.5%

청소년유해업소업종코드
Real number (ℝ)

Distinct: 59
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12928.039
Minimum: 10101
Maximum: 30111
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 10101
5-th percentile: 10103
Q1: 10113
Median: 10208
Q3: 10413
95-th percentile: 24201
Maximum: 30111
Range: 20010
Interquartile range (IQR): 300

Descriptive statistics

Standard deviation: 5280.6722
Coefficient of variation (CV): 0.40846662
Kurtosis: 1.4400217
Mean: 12928.039
Median Absolute Deviation (MAD): 105
Skewness: 1.6678893
Sum: 1.2928038 × 10^8
Variance: 27885499
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
10411 | 254 | 2.5%
20101 | 246 | 2.5%
10118 | 242 | 2.4%
10102 | 241 | 2.4%
20105 | 241 | 2.4%
10117 | 238 | 2.4%
10208 | 238 | 2.4%
10111 | 237 | 2.4%
10113 | 235 | 2.4%
10101 | 234 | 2.3%
Other values (49) | 7594 | 75.9%

Value | Count | Frequency (%)
10101 | 234 | 2.3%
10102 | 241 | 2.4%
10103 | 227 | 2.3%
10104 | 226 | 2.3%
10105 | 220 | 2.2%
10106 | 219 | 2.2%
10107 | 222 | 2.2%
10108 | 124 | 1.2%
10109 | 10 | 0.1%
10110 | 196 | 2.0%

Value | Count | Frequency (%)
30111 | 146 | 1.5%
30110 | 77 | 0.8%
24205 | 220 | 2.2%
24201 | 178 | 1.8%
24113 | 223 | 2.2%
24101 | 73 | 0.7%
20399 | 137 | 1.4%
20301 | 197 | 2.0%
20199 | 100 | 1.0%
20107 | 195 | 1.9%

청소년유해업소업종명
Text

Distinct: 54
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 12
Median length: 9
Mean length: 4.2362
Min length: 2

Characters and Unicode

Total characters: 42362
Distinct characters: 128
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
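As a rough illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The two sample strings are values that appear in this report; the function name `category_counts` is illustrative, and this is not necessarily how the profiling tool computes its tables.

```python
import unicodedata
from collections import Counter


def category_counts(texts):
    """Count Unicode general categories (e.g. 'Lo', 'Ps') over all characters."""
    counts = Counter()
    for text in texts:
        # Normalize to NFC so each Hangul syllable is a single code point.
        for ch in unicodedata.normalize("NFC", text):
            counts[unicodedata.category(ch)] += 1
    return counts


samples = ["호프(소주방)", "이용업 기타"]
counts = category_counts(samples)

# Lo = Other Letter, Ps = Open Punctuation, Pe = Close Punctuation,
# Zs = Space Separator, Po = Other Punctuation.
print(counts)
print(unicodedata.category(","))
```

The category codes map onto the names used in the report: Hangul syllables fall under Other Letter, parentheses under Open and Close Punctuation, the space under Space Separator, and the comma under Other Punctuation.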

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 게임제공업
2nd row: 호프(소주방)
3rd row: 기타
4th row: 노래연습장업
5th row: 까페

Value | Count | Frequency (%)
기타 | 867 | 8.5%
패스트푸드 | 491 | 4.8%
전통찻집 | 318 | 3.1%
관광호텔 | 289 | 2.8%
식육취급 | 242 | 2.4%
여관업 | 241 | 2.4%
중국식 | 241 | 2.4%
까페 | 238 | 2.3%
룸살롱 | 238 | 2.3%
통닭(치킨 | 235 | 2.3%
Other values (44) | 6837 | 66.8%

Most occurring characters

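Value | Count | Frequency (%)
(not preserved) | 1787 | 4.2%
(not preserved) | 1626 | 3.8%
) | 1218 | 2.9%
( | 1218 | 2.9%
(not preserved) | 1049 | 2.5%
(not preserved) | 964 | 2.3%
(not preserved) | 867 | 2.0%
(not preserved) | 867 | 2.0%
(not preserved) | 823 | 1.9%
(not preserved) | 818 | 1.9%
Other values (118) | 31125 | 73.5%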

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 39467 | 93.2%
Close Punctuation | 1218 | 2.9%
Open Punctuation | 1218 | 2.9%
Space Separator | 237 | 0.6%
Other Punctuation | 222 | 0.5%

Most frequent character per category

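Other Letter
Value | Count | Frequency (%)
(not preserved) | 1787 | 4.5%
(not preserved) | 1626 | 4.1%
(not preserved) | 1049 | 2.7%
(not preserved) | 964 | 2.4%
(not preserved) | 867 | 2.2%
(not preserved) | 867 | 2.2%
(not preserved) | 823 | 2.1%
(not preserved) | 818 | 2.1%
(not preserved) | 783 | 2.0%
(not preserved) | 762 | 1.9%
Other values (114) | 29121 | 73.8%

Close Punctuation
Value | Count | Frequency (%)
) | 1218 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 1218 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 237 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 222 | 100.0%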

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 39467 | 93.2%
Common | 2895 | 6.8%

Most frequent character per script

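Hangul
Value | Count | Frequency (%)
(not preserved) | 1787 | 4.5%
(not preserved) | 1626 | 4.1%
(not preserved) | 1049 | 2.7%
(not preserved) | 964 | 2.4%
(not preserved) | 867 | 2.2%
(not preserved) | 867 | 2.2%
(not preserved) | 823 | 2.1%
(not preserved) | 818 | 2.1%
(not preserved) | 783 | 2.0%
(not preserved) | 762 | 1.9%
Other values (114) | 29121 | 73.8%

Common
Value | Count | Frequency (%)
) | 1218 | 42.1%
( | 1218 | 42.1%
(space) | 237 | 8.2%
, | 222 | 7.7%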
7.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 39467 | 93.2%
ASCII | 2895 | 6.8%

Most frequent character per block

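Hangul
Value | Count | Frequency (%)
(not preserved) | 1787 | 4.5%
(not preserved) | 1626 | 4.1%
(not preserved) | 1049 | 2.7%
(not preserved) | 964 | 2.4%
(not preserved) | 867 | 2.2%
(not preserved) | 867 | 2.2%
(not preserved) | 823 | 2.1%
(not preserved) | 818 | 2.1%
(not preserved) | 783 | 2.0%
(not preserved) | 762 | 1.9%
Other values (114) | 29121 | 73.8%

ASCII
Value | Count | Frequency (%)
) | 1218 | 42.1%
( | 1218 | 42.1%
(space) | 237 | 8.2%
, | 222 | 7.7%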

건수
Real number (ℝ)

Distinct: 730
Distinct (%): 7.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 157.6852
Minimum: 1
Maximum: 4930
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
Median: 24
Q3: 134
95-th percentile: 617
Maximum: 4930
Range: 4929
Interquartile range (IQR): 130
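The quartiles above can be turned into a quick outlier screen. The sketch below applies the conventional 1.5 × IQR fences (a common rule of thumb, not something the report itself computes) to the reported Q1 and Q3; the function name `tukey_fences` is illustrative.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) fences at k * IQR beyond the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles reported for 건수.
q1, q3 = 4, 134
lower, upper = tukey_fences(q1, q3)
print(lower, upper)

# The reported 95th percentile already lies above the upper fence.
p95 = 617
print(p95 > upper)
```

Because the 95th percentile (617) exceeds the upper fence, at least 5% of the observations would be flagged under this rule, which fits the heavy right tail suggested by the gap between the median of 24 and the maximum of 4930.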

Descriptive statistics

Standard deviation: 392.15118
Coefficient of variation (CV): 2.4869244
Kurtosis: 43.062134
Mean: 157.6852
Median Absolute Deviation (MAD): 23
Skewness: 5.6316444
Sum: 1576852
Variance: 153782.55
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 1066 | 10.7%
2 | 729 | 7.3%
4 | 543 | 5.4%
3 | 352 | 3.5%
5 | 336 | 3.4%
8 | 209 | 2.1%
7 | 208 | 2.1%
6 | 173 | 1.7%
11 | 150 | 1.5%
9 | 142 | 1.4%
Other values (720) | 6092 | 60.9%

Value | Count | Frequency (%)
1 | 1066 | 10.7%
2 | 729 | 7.3%
3 | 352 | 3.5%
4 | 543 | 5.4%
5 | 336 | 3.4%
6 | 173 | 1.7%
7 | 208 | 2.1%
8 | 209 | 2.1%
9 | 142 | 1.4%
10 | 81 | 0.8%

Value | Count | Frequency (%)
4930 | 1 | < 0.1%
4929 | 1 | < 0.1%
4926 | 5 | 0.1%
4924 | 1 | < 0.1%
4922 | 2 | < 0.1%
4921 | 1 | < 0.1%
4917 | 1 | < 0.1%
3228 | 1 | < 0.1%
3227 | 1 | < 0.1%
3221 | 2 | < 0.1%

Interactions

[Figures: nine pairwise interaction plots of the numeric variables]

Correlations

[Figure: correlation heatmap]
 | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수
처리일자 | 1.000 | 0.025 | 0.000 | 0.000
청소년유해업소업종코드 | 0.025 | 1.000 | 0.999 | 0.135
청소년유해업소업종명 | 0.000 | 0.999 | 1.000 | 0.738
건수 | 0.000 | 0.135 | 0.738 | 1.000

[Figure: correlation heatmap]
 | 처리일자 | 청소년유해업소업종코드 | 건수
처리일자 | 1.000 | 0.007 | -0.004
청소년유해업소업종코드 | 0.007 | 1.000 | -0.276
건수 | -0.004 | -0.276 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

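(index) | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수
26161 | 20240504 | 24101 | 게임제공업 | 11
5582 | 20240416 | 10112 | 호프(소주방) | 339
17705 | 20240427 | 10299 | 기타 | 9
29456 | 20240507 | 24205 | 노래연습장업 | 322
4669 | 20240415 | 10117 | 까페 | 39
11824 | 20240421 | 10408 | 철도역구내 | 1
4018 | 20240414 | 10202 | 고고(디스코)클럽 | 4
11594 | 20240421 | 10117 | 까페 | 26
6239 | 20240416 | 10301 | 단란주점 | 76
7951 | 20240418 | 20101 | 관광호텔 | 2

(index) | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수
14618 | 20240424 | 20105 | 여관업 | 17
24064 | 20240502 | 10301 | 단란주점 | 92
17273 | 20240426 | 20107 | 여인숙업 | 27
323 | 20240411 | 10111 | 패스트푸드 | 20
24609 | 20240503 | 10101 | 한식 | 3081
8775 | 20240418 | 10412 | 커피숍 | 1465
32671 | 20240510 | 10116 | 생선회 | 33
18484 | 20240427 | 10111 | 패스트푸드 | 16
12633 | 20240422 | 20105 | 여관업 | 10
31484 | 20240509 | 10101 | 한식 | 4926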

Duplicate rows

Most frequently occurring

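(index) | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수 | # duplicates
296 | 20240420 | 24201 | 비디오물감상실업 | 1 | 6
407 | 20240424 | 24201 | 비디오물감상실업 | 1 | 6
641 | 20240502 | 10413 | 전통찻집 | 1 | 6
801 | 20240507 | 24201 | 비디오물감상실업 | 1 | 6
294 | 20240420 | 20399 | 이용업 기타 | 1 | 5
400 | 20240424 | 10409 | 백화점 | 1 | 5
534 | 20240429 | 10201 | 카바레 | 2 | 5
632 | 20240502 | 10201 | 카바레 | 2 | 5
842 | 20240509 | 10201 | 카바레 | 2 | 5
23 | 20240411 | 20102 | 일반호텔 | 1 | 4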