Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 887
Duplicate rows (%): 8.9%
Total size in memory: 419.9 KiB
Average record size in memory: 43.0 B

Variable types

Numeric: 3
Text: 1

Dataset

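Description: 처리일자 (processing date), 청소년유해업소업종코드 (business type code for establishments harmful to youth), 청소년유해업소업종명 (business type name for establishments harmful to youth), 건수 (count)
Author: 구로구 (Guro-gu)
URL: https://data.seoul.go.kr/dataList/OA-2547/S/1/datasetView.do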

Alerts

Dataset has 887 (8.9%) duplicate rows (alert type: Duplicates)

Reproduction

Analysis started: 2024-05-11 16:08:12.291211
Analysis finished: 2024-05-11 16:08:17.734081
Duration: 5.44 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
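A report of this kind can be regenerated with ydata-profiling. The following is a minimal sketch, not the exact invocation used here: the function name `build_report`, the file paths, and the report title are illustrative assumptions, and the settings stored in config.json are not reproduced.

```python
from pathlib import Path


def build_report(csv_path: str, out_path: str = "report.html") -> Path:
    """Profile a CSV file and write an HTML report (paths are hypothetical)."""
    # Imported lazily so this module still loads where the libraries are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the dataset was exported as a CSV with the four columns
    # listed in the description above.
    df = pd.read_csv(csv_path)
    report = ProfileReport(df, title="Profiling report")
    report.to_file(out_path)
    return Path(out_path)
```

Calling `build_report("data.csv")` with a local copy of the dataset would write `report.html` to the working directory; the file name `data.csv` is a placeholder.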

Variables

처리일자
Real number (ℝ)

Distinct: 30
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20240449
Minimum: 20240411
Maximum: 20240510
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 20240411
5-th percentile: 20240412
Q1: 20240418
Median: 20240426
Q3: 20240503
95-th percentile: 20240509
Maximum: 20240510
Range: 99
Interquartile range (IQR): 85
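The values of this variable look like dates encoded as YYYYMMDD integers, so the Range and IQR above are differences between integers rather than day counts. A minimal sketch, assuming that reading is correct (the helper `parse_yyyymmdd` is our own name, not part of the report), converts the endpoints to calendar dates:

```python
from datetime import date, datetime


def parse_yyyymmdd(value: int) -> date:
    """Convert an integer such as 20240411 into a calendar date."""
    return datetime.strptime(str(value), "%Y%m%d").date()


first = parse_yyyymmdd(20240411)  # Minimum reported above
last = parse_yyyymmdd(20240510)   # Maximum reported above

# Calendar span in days, as opposed to the integer difference of 99.
span_days = (last - first).days
# Number of calendar dates in the closed interval [first, last].
dates_in_span = span_days + 1
print(span_days, dates_in_span)
```

Under this reading the span is 29 days and covers 30 calendar dates, which agrees with the distinct count of 30 reported for this variable.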

Descriptive statistics

Standard deviation: 40.281452
Coefficient of variation (CV): 1.9901462 × 10^-6
Kurtosis: -1.4657246
Mean: 20240449
Median Absolute Deviation (MAD): 10
Skewness: 0.68283654
Sum: 2.0240449 × 10^11
Variance: 1622.5954
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=30)
Most frequent values
Value | Count | Frequency (%)
20240430 | 373 | 3.7%
20240425 | 358 | 3.6%
20240413 | 356 | 3.6%
20240507 | 351 | 3.5%
20240502 | 351 | 3.5%
20240509 | 351 | 3.5%
20240426 | 349 | 3.5%
20240424 | 346 | 3.5%
20240510 | 343 | 3.4%
20240428 | 342 | 3.4%
Other values (20) | 6480 | 64.8%

Smallest 10 values
Value | Count | Frequency (%)
20240411 | 316 | 3.2%
20240412 | 323 | 3.2%
20240413 | 356 | 3.6%
20240414 | 324 | 3.2%
20240415 | 334 | 3.3%
20240416 | 325 | 3.2%
20240417 | 340 | 3.4%
20240418 | 332 | 3.3%
20240419 | 330 | 3.3%
20240420 | 337 | 3.4%

Largest 10 values
Value | Count | Frequency (%)
20240510 | 343 | 3.4%
20240509 | 351 | 3.5%
20240508 | 320 | 3.2%
20240507 | 351 | 3.5%
20240506 | 331 | 3.3%
20240505 | 319 | 3.2%
20240504 | 305 | 3.0%
20240503 | 319 | 3.2%
20240502 | 351 | 3.5%
20240501 | 314 | 3.1%

청소년유해업소업종코드
Real number (ℝ)

Distinct: 59
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12949.38
Minimum: 10101
Maximum: 30111
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 10101
5-th percentile: 10103
Q1: 10113
Median: 10209
Q3: 10413
95-th percentile: 24201
Maximum: 30111
Range: 20010
Interquartile range (IQR): 300

Descriptive statistics

Standard deviation: 5285.186
Coefficient of variation (CV): 0.408142
Kurtosis: 1.305003
Mean: 12949.38
Median Absolute Deviation (MAD): 106
Skewness: 1.6386254
Sum: 1.294938 × 10^8
Variance: 27933191
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values
Value | Count | Frequency (%)
10112 | 243 | 2.4%
24205 | 243 | 2.4%
20101 | 241 | 2.4%
10115 | 239 | 2.4%
10119 | 236 | 2.4%
10199 | 236 | 2.4%
10117 | 235 | 2.4%
10101 | 235 | 2.4%
10301 | 233 | 2.3%
10402 | 232 | 2.3%
Other values (49) | 7627 | 76.3%

Smallest 10 values
Value | Count | Frequency (%)
10101 | 235 | 2.4%
10102 | 211 | 2.1%
10103 | 206 | 2.1%
10104 | 223 | 2.2%
10105 | 228 | 2.3%
10106 | 222 | 2.2%
10107 | 221 | 2.2%
10108 | 112 | 1.1%
10109 | 22 | 0.2%
10110 | 184 | 1.8%

Largest 10 values
Value | Count | Frequency (%)
30111 | 133 | 1.3%
30110 | 72 | 0.7%
24205 | 243 | 2.4%
24201 | 195 | 1.9%
24113 | 219 | 2.2%
24101 | 94 | 0.9%
20399 | 119 | 1.2%
20301 | 205 | 2.1%
20199 | 92 | 0.9%
20107 | 210 | 2.1%

청소년유해업소업종명
Text

Distinct: 54
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 12
Median length: 9
Mean length: 4.236
Min length: 2

Characters and Unicode

Total characters: 42360
Distinct characters: 128
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
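As an illustration of such character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below tallies categories over a few strings; the helper `category_counts`, the name mapping, and the sample strings are our own illustrative choices and not the profiler's implementation.

```python
import unicodedata
from collections import Counter

# Two-letter general-category codes mapped to the names used in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Zs": "Space Separator",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for text in values:
        # Normalise to NFC so each Hangul syllable is a single code point.
        for ch in unicodedata.normalize("NFC", text):
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Illustrative strings, not rows taken from the dataset.
sample = ["이용업 기타", "호프(소주방)", "한식,까페"]
print(category_counts(sample))
```

Hangul syllables fall under Other Letter, parentheses under Open and Close Punctuation, the comma under Other Punctuation, and the space under Space Separator, which matches the five categories reported for this variable.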

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 패스트푸드 (fast food)
2nd row: 전통찻집 (traditional teahouse)
3rd row: 복어취급 (pufferfish handling)
4th row: 한식 (Korean food)
5th row: 까페 (cafe)
Value | Count | Frequency (%)
기타 | 846 | 8.3%
패스트푸드 | 440 | 4.3%
전통찻집 | 311 | 3.0%
관광호텔 | 282 | 2.8%
노래연습장업 | 243 | 2.4%
호프(소주방 | 243 | 2.4%
김밥(도시락 | 239 | 2.3%
탕류 | 236 | 2.3%
까페 | 235 | 2.3%
한식 | 235 | 2.3%
Other values (44) | 6901 | 67.6%

Most occurring characters

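Value | Count | Frequency (%)
(missing) | 1805 | 4.3%
(missing) | 1561 | 3.7%
) | 1244 | 2.9%
( | 1244 | 2.9%
(missing) | 1058 | 2.5%
(missing) | 991 | 2.3%
(missing) | 846 | 2.0%
(missing) | 846 | 2.0%
(missing) | 835 | 2.0%
(missing) | 791 | 1.9%
Other values (118) | 31139 | 73.5%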

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 39440 | 93.1%
Close Punctuation | 1244 | 2.9%
Open Punctuation | 1244 | 2.9%
Other Punctuation | 221 | 0.5%
Space Separator | 211 | 0.5%

Most frequent character per category

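Other Letter
Value | Count | Frequency (%)
(missing) | 1805 | 4.6%
(missing) | 1561 | 4.0%
(missing) | 1058 | 2.7%
(missing) | 991 | 2.5%
(missing) | 846 | 2.1%
(missing) | 846 | 2.1%
(missing) | 835 | 2.1%
(missing) | 791 | 2.0%
(missing) | 756 | 1.9%
(missing) | 753 | 1.9%
Other values (114) | 29198 | 74.0%

Close Punctuation
Value | Count | Frequency (%)
) | 1244 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 1244 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 221 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 211 | 100.0%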

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 39440 | 93.1%
Common | 2920 | 6.9%

Most frequent character per script

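Hangul
Value | Count | Frequency (%)
(missing) | 1805 | 4.6%
(missing) | 1561 | 4.0%
(missing) | 1058 | 2.7%
(missing) | 991 | 2.5%
(missing) | 846 | 2.1%
(missing) | 846 | 2.1%
(missing) | 835 | 2.1%
(missing) | 791 | 2.0%
(missing) | 756 | 1.9%
(missing) | 753 | 1.9%
Other values (114) | 29198 | 74.0%

Common
Value | Count | Frequency (%)
) | 1244 | 42.6%
( | 1244 | 42.6%
, | 221 | 7.6%
(space) | 211 | 7.2%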

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 39440 | 93.1%
ASCII | 2920 | 6.9%

Most frequent character per block

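Hangul
Value | Count | Frequency (%)
(missing) | 1805 | 4.6%
(missing) | 1561 | 4.0%
(missing) | 1058 | 2.7%
(missing) | 991 | 2.5%
(missing) | 846 | 2.1%
(missing) | 846 | 2.1%
(missing) | 835 | 2.1%
(missing) | 791 | 2.0%
(missing) | 756 | 1.9%
(missing) | 753 | 1.9%
Other values (114) | 29198 | 74.0%

ASCII
Value | Count | Frequency (%)
) | 1244 | 42.6%
( | 1244 | 42.6%
, | 221 | 7.6%
(space) | 211 | 7.2%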

건수
Real number (ℝ)

Distinct: 736
Distinct (%): 7.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 157.548
Minimum: 1
Maximum: 4930
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
Median: 23
Q3: 138
95-th percentile: 612
Maximum: 4930
Range: 4929
Interquartile range (IQR): 134

Descriptive statistics

Standard deviation: 391.62407
Coefficient of variation (CV): 2.4857445
Kurtosis: 39.328087
Mean: 157.548
Median Absolute Deviation (MAD): 22
Skewness: 5.4913886
Sum: 1575480
Variance: 153369.41
Monotonicity: Not monotonic
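Several of these figures are tied together by simple identities: the mean is the sum divided by the number of observations, the variance is the square of the standard deviation, and the coefficient of variation is the standard deviation divided by the mean. A short sketch that recomputes them from the reported values (the variable names are our own):

```python
# Figures copied from the 건수 statistics in this report.
n_rows = 10000
total = 1575480
std_dev = 391.62407

mean = total / n_rows      # mean = sum / number of observations
variance = std_dev ** 2    # variance = (standard deviation) squared
cv = std_dev / mean        # CV = standard deviation / mean

print(f"mean={mean:.3f} variance={variance:.2f} cv={cv:.6f}")
```

The recomputed values agree with the reported mean, variance, and CV to the printed precision. The mean (157.548) also sits well above the median (23), which is consistent with the strongly positive skewness.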
Histogram with fixed size bins (bins=50)
Most frequent values
Value | Count | Frequency (%)
1 | 1117 | 11.2%
2 | 714 | 7.1%
4 | 478 | 4.8%
5 | 362 | 3.6%
3 | 346 | 3.5%
7 | 217 | 2.2%
8 | 202 | 2.0%
6 | 198 | 2.0%
11 | 152 | 1.5%
9 | 143 | 1.4%
Other values (726) | 6071 | 60.7%

Smallest 10 values
Value | Count | Frequency (%)
1 | 1117 | 11.2%
2 | 714 | 7.1%
3 | 346 | 3.5%
4 | 478 | 4.8%
5 | 362 | 3.6%
6 | 198 | 2.0%
7 | 217 | 2.2%
8 | 202 | 2.0%
9 | 143 | 1.4%
10 | 83 | 0.8%

Largest 10 values
Value | Count | Frequency (%)
4930 | 1 | < 0.1%
4926 | 2 | < 0.1%
4924 | 1 | < 0.1%
4923 | 1 | < 0.1%
4922 | 2 | < 0.1%
4920 | 1 | < 0.1%
3227 | 1 | < 0.1%
3225 | 1 | < 0.1%
3221 | 5 | 0.1%
3220 | 2 | < 0.1%

Interactions

[Figures: nine interaction plots; images not included in this text version]

Correlations

Variable | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수
처리일자 | 1.000 | 0.000 | 0.000 | 0.000
청소년유해업소업종코드 | 0.000 | 1.000 | 0.999 | 0.136
청소년유해업소업종명 | 0.000 | 0.999 | 1.000 | 0.738
건수 | 0.000 | 0.136 | 0.738 | 1.000

Variable | 처리일자 | 청소년유해업소업종코드 | 건수
처리일자 | 1.000 | 0.005 | -0.000
청소년유해업소업종코드 | 0.005 | 1.000 | -0.264
건수 | -0.000 | -0.264 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

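Index | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수
32531 | 20240510 | 10111 | 패스트푸드 | 27
4834 | 20240415 | 10108 | 전통찻집 | 1
9245 | 20240419 | 10114 | 복어취급 | 11
25941 | 20240504 | 10101 | 한식 | 1541
508 | 20240411 | 10117 | 까페 | 119
23243 | 20240502 | 10413 | 전통찻집 | 8
8331 | 20240418 | 10210 | 간이주점 | 1
32835 | 20240510 | 20301 | 일반이용업 | 112
20742 | 20240429 | 10403 | 일반조리판매 | 490
16648 | 20240426 | 20105 | 여관업 | 53

Index | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수
16023 | 20240425 | 10408 | 철도역구내 | 18
17183 | 20240426 | 10404 | 관광호텔 | 5
1874 | 20240412 | 10402 | 다방 | 23
10522 | 20240420 | 10103 | 경양식 | 127
28548 | 20240506 | 10499 | 기타 | 217
18099 | 20240427 | 10102 | 중국식 | 205
22078 | 20240501 | 10208 | 룸살롱 | 12
19460 | 20240428 | 10101 | 한식 | 1978
29217 | 20240507 | 10110 | 출장조리 | 10
10665 | 20240420 | 10118 | 식육취급 | 38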

Duplicate rows

Most frequently occurring

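Index | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수 | # duplicates
292 | 20240420 | 24201 | 비디오물감상실업 | 1 | 7
44 | 20240412 | 10408 | 철도역구내 | 1 | 6
183 | 20240417 | 10201 | 카바레 | 2 | 6
377 | 20240423 | 24201 | 비디오물감상실업 | 1 | 6
35 | 20240412 | 10201 | 카바레 | 2 | 5
87 | 20240413 | 24201 | 비디오물감상실업 | 1 | 5
142 | 20240415 | 20399 | 이용업 기타 | 1 | 5
164 | 20240416 | 10413 | 전통찻집 | 1 | 5
254 | 20240419 | 10413 | 전통찻집 | 1 | 5
436 | 20240425 | 20399 | 이용업 기타 | 1 | 5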
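
Duplicate rows can be counted in more than one way, and this report does not state whether its figure of 887 counts distinct rows that repeat or the surplus copies beyond each first occurrence. The sketch below computes both using only the standard library; the function name `duplicate_summary` and the sample rows are illustrative assumptions, not the profiler's implementation or real data.

```python
from collections import Counter


def duplicate_summary(rows):
    """Summarise exact duplicate rows.

    Returns a pair: (number of distinct rows that occur more than once,
    number of surplus copies beyond the first occurrence of each).
    """
    counts = Counter(tuple(row) for row in rows)
    repeated = {row: n for row, n in counts.items() if n > 1}
    groups = len(repeated)
    surplus = sum(n - 1 for n in repeated.values())
    return groups, surplus


# Illustrative rows (처리일자, 업종코드, 업종명, 건수); not the real data.
rows = [
    (20240412, 10201, "카바레", 2),
    (20240412, 10201, "카바레", 2),
    (20240412, 10201, "카바레", 2),
    (20240416, 10413, "전통찻집", 1),
    (20240416, 10413, "전통찻집", 1),
    (20240420, 10103, "경양식", 127),
]
print(duplicate_summary(rows))
```

For these six rows the function reports two repeated row patterns and three surplus copies. Applying both counts to the real dataset would show which definition matches the reported figure.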