Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 877
Duplicate rows (%): 8.8%
Total size in memory: 419.9 KiB
Average record size in memory: 43.0 B
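
The percentage and per-record figures above follow from the raw counts. A minimal sketch of that arithmetic, using only numbers from the table; the variable names are illustrative, and treating 1 KiB as 1024 bytes is an assumption about how the report converts units:

```python
# Values copied from the dataset statistics; names are illustrative.
n_observations = 10_000
duplicate_rows = 877
total_size_kib = 419.9

# Share of duplicate rows as a percentage of all observations.
duplicate_pct = duplicate_rows / n_observations * 100

# Average record size in bytes, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{duplicate_pct:.1f}%")
print(f"{avg_record_bytes:.1f} B")
```

Rounded to one decimal place, both results agree with the reported 8.8% and 43.0 B.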

Variable types

Numeric: 3
Text: 1

Dataset

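Description: 처리일자 (processing date), 청소년유해업소업종코드 (business-type code for establishments harmful to youth), 청소년유해업소업종명 (business-type name for establishments harmful to youth), 건수 (count)
Author: 성북구 (Seongbuk-gu)
URL: https://data.seoul.go.kr/dataList/OA-11117/S/1/datasetView.do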

Alerts

Dataset has 877 (8.8%) duplicate rows [Duplicates]

Reproduction

Analysis started: 2024-04-06 12:37:21.960623
Analysis finished: 2024-04-06 12:37:24.720786
Duration: 2.76 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

처리일자 (processing date)
Real number (ℝ)

Distinct: 31
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20240332
Minimum: 20240306
Maximum: 20240405
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 20240306
5-th percentile: 20240307
Q1: 20240313
Median: 20240321
Q3: 20240329
95-th percentile: 20240404
Maximum: 20240405
Range: 99
Interquartile range (IQR): 16
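
The values in this column look like calendar dates stored as YYYYMMDD integers. That reading is an interpretation, not something the report states. Under that assumption, the Range of 99 is a difference between integers rather than a number of days. A minimal sketch of the distinction, where `parse_yyyymmdd` is a hypothetical helper:

```python
from datetime import date


def parse_yyyymmdd(value: int) -> date:
    """Split an integer such as 20240306 into year, month and day."""
    year, rest = divmod(value, 10_000)
    month, day = divmod(rest, 100)
    return date(year, month, day)


minimum = 20240306
maximum = 20240405

# Difference between the raw integers, as the report computes Range.
integer_range = maximum - minimum

# Difference between the calendar dates those integers appear to encode.
calendar_span_days = (parse_yyyymmdd(maximum) - parse_yyyymmdd(minimum)).days

print(integer_range, calendar_span_days)
```

The calendar span is 30 days, which is consistent with 31 distinct values if every day in the interval occurs. The same caveat applies to the mean, standard deviation and other statistics computed on the raw integers.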

Descriptive statistics

Standard deviation: 31.787937
Coefficient of variation (CV): 1.5705245 × 10^-6
Kurtosis: 1.0644112
Mean: 20240332
Median Absolute Deviation (MAD): 8
Skewness: 1.6562267
Sum: 2.0240332 × 10^11
Variance: 1010.4729
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=31)
Value | Count | Frequency (%)
20240307 | 351 | 3.5%
20240331 | 348 | 3.5%
20240308 | 342 | 3.4%
20240321 | 338 | 3.4%
20240311 | 338 | 3.4%
20240330 | 335 | 3.4%
20240325 | 334 | 3.3%
20240401 | 331 | 3.3%
20240313 | 331 | 3.3%
20240328 | 330 | 3.3%
Other values (21) | 6622 | 66.2%

Value | Count | Frequency (%)
20240306 | 312 | 3.1%
20240307 | 351 | 3.5%
20240308 | 342 | 3.4%
20240309 | 318 | 3.2%
20240310 | 311 | 3.1%
20240311 | 338 | 3.4%
20240312 | 321 | 3.2%
20240313 | 331 | 3.3%
20240314 | 293 | 2.9%
20240315 | 295 | 2.9%

Value | Count | Frequency (%)
20240405 | 315 | 3.1%
20240404 | 328 | 3.3%
20240403 | 305 | 3.0%
20240402 | 329 | 3.3%
20240401 | 331 | 3.3%
20240331 | 348 | 3.5%
20240330 | 335 | 3.4%
20240329 | 327 | 3.3%
20240328 | 330 | 3.3%
20240327 | 329 | 3.3%
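
청소년유해업소업종코드 (business-type code for establishments harmful to youth)
Real number (ℝ)

Distinct: 59
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12912.314
Minimum: 10101
Maximum: 30111
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB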

Quantile statistics

Minimum: 10101
5-th percentile: 10103
Q1: 10113
Median: 10210
Q3: 10413
95-th percentile: 24201
Maximum: 30111
Range: 20010
Interquartile range (IQR): 300

Descriptive statistics

Standard deviation: 5222.9006
Coefficient of variation (CV): 0.40448989
Kurtosis: 1.332111
Mean: 12912.314
Median Absolute Deviation (MAD): 107
Skewness: 1.6447743
Sum: 1.2912314 × 10^8
Variance: 27278690
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
10499 | 245 | 2.5%
10105 | 245 | 2.5%
10412 | 243 | 2.4%
10199 | 239 | 2.4%
10101 | 238 | 2.4%
10403 | 237 | 2.4%
10106 | 236 | 2.4%
10111 | 233 | 2.3%
20301 | 232 | 2.3%
10301 | 231 | 2.3%
Other values (49) | 7621 | 76.2%

Value | Count | Frequency (%)
10101 | 238 | 2.4%
10102 | 208 | 2.1%
10103 | 221 | 2.2%
10104 | 219 | 2.2%
10105 | 245 | 2.5%
10106 | 236 | 2.4%
10107 | 215 | 2.1%
10108 | 121 | 1.2%
10109 | 19 | 0.2%
10110 | 165 | 1.7%

Value | Count | Frequency (%)
30111 | 119 | 1.2%
30110 | 71 | 0.7%
24205 | 222 | 2.2%
24201 | 197 | 2.0%
24113 | 204 | 2.0%
24101 | 96 | 1.0%
20399 | 147 | 1.5%
20301 | 232 | 2.3%
20199 | 97 | 1.0%
20107 | 201 | 2.0%

청소년유해업소업종명 (business-type name for establishments harmful to youth)
Text

Distinct: 54
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 12
Median length: 9
Mean length: 4.2238
Min length: 2

Characters and Unicode

Total characters: 42238
Distinct characters: 128
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
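
As an illustration of such character properties, the sketch below counts characters per Unicode general category with Python's standard `unicodedata` module. It is not the report's own implementation, which is not shown here. The function name and the three sample strings (business-type names that appear in this dataset) are chosen for illustration. The standard library exposes two-letter category codes: `Lo` is Other Letter, `Ps` is Open Punctuation, `Pe` is Close Punctuation, `Zs` is Space Separator and `Po` is Other Punctuation.

```python
import unicodedata
from collections import Counter


def category_counts(strings):
    """Count characters per Unicode general category code (e.g. 'Lo', 'Ps')."""
    counts = Counter()
    for text in strings:
        for ch in text:
            counts[unicodedata.category(ch)] += 1
    return counts


# Illustrative sample only; the real column has 10000 values.
sample = ["기타", "고고(디스코)클럽", "이용업 기타"]
counts = category_counts(sample)

for code, n in counts.most_common():
    print(code, n)
```

Script and block properties are not available from `unicodedata`, so reproducing those parts of the report would need an additional data source.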

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 기타 (other)
2nd row: 백화점 (department store)
3rd row: 카바레 (cabaret)
4th row: 뷔페식 (buffet)
5th row: 식육취급 (meat handling)
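Value | Count | Frequency (%)
기타 (other) | 908 | 8.9%
패스트푸드 (fast food) | 457 | 4.5%
전통찻집 (traditional tea house) | 329 | 3.2%
관광호텔 (tourist hotel) | 273 | 2.7%
분식 (snack food) | 245 | 2.4%
커피숍 (coffee shop) | 243 | 2.4%
한식 (Korean food) | 238 | 2.3%
일반조리판매 (general cooking and sales) | 237 | 2.3%
뷔페식 (buffet) | 236 | 2.3%
일반이용업 (general barbershop business) | 232 | 2.3%
Other values (44) | 6846 | 66.8%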

Most occurring characters

ValueCountFrequency (%)
1814
 
4.3%
1599
 
3.8%
( 1206
 
2.9%
) 1206
 
2.9%
1070
 
2.5%
1003
 
2.4%
908
 
2.1%
908
 
2.1%
851
 
2.0%
786
 
1.9%
Other values (118) 30887
73.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 39367 | 93.2%
Open Punctuation | 1206 | 2.9%
Close Punctuation | 1206 | 2.9%
Space Separator | 244 | 0.6%
Other Punctuation | 215 | 0.5%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1814
 
4.6%
1599
 
4.1%
1070
 
2.7%
1003
 
2.5%
908
 
2.3%
908
 
2.3%
851
 
2.2%
786
 
2.0%
759
 
1.9%
738
 
1.9%
Other values (114) 28931
73.5%
Open Punctuation
ValueCountFrequency (%)
( 1206
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1206
100.0%
Space Separator
ValueCountFrequency (%)
244
100.0%
Other Punctuation
ValueCountFrequency (%)
, 215
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 39367 | 93.2%
Common | 2871 | 6.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1814
 
4.6%
1599
 
4.1%
1070
 
2.7%
1003
 
2.5%
908
 
2.3%
908
 
2.3%
851
 
2.2%
786
 
2.0%
759
 
1.9%
738
 
1.9%
Other values (114) 28931
73.5%
Common
ValueCountFrequency (%)
( 1206
42.0%
) 1206
42.0%
244
 
8.5%
, 215
 
7.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 39367 | 93.2%
ASCII | 2871 | 6.8%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1814
 
4.6%
1599
 
4.1%
1070
 
2.7%
1003
 
2.5%
908
 
2.3%
908
 
2.3%
851
 
2.2%
786
 
2.0%
759
 
1.9%
738
 
1.9%
Other values (114) 28931
73.5%
ASCII
ValueCountFrequency (%)
( 1206
42.0%
) 1206
42.0%
244
 
8.5%
, 215
 
7.5%

건수 (count)
Real number (ℝ)

Distinct: 737
Distinct (%): 7.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 164.0037
Minimum: 1
Maximum: 4914
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
Median: 23.5
Q3: 140
95-th percentile: 683
Maximum: 4914
Range: 4913
Interquartile range (IQR): 136

Descriptive statistics

Standard deviation: 408.32784
Coefficient of variation (CV): 2.4897478
Kurtosis: 40.674789
Mean: 164.0037
Median Absolute Deviation (MAD): 22.5
Skewness: 5.5358869
Sum: 1640037
Variance: 166731.63
Monotonicity: Not monotonic
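
Several of these figures are related by simple identities. A minimal sketch that re-derives them from the reported values; the variable names are illustrative, and small differences in the last digits are expected because the inputs are already rounded:

```python
# Values copied from the 건수 statistics; names are illustrative.
n_observations = 10_000
total = 1_640_037
std_dev = 408.32784

mean = total / n_observations   # mean = sum / number of observations
cv = std_dev / mean             # coefficient of variation = std / mean
variance = std_dev ** 2         # variance = squared standard deviation

print(mean, round(cv, 4), round(variance, 1))
```

The mean (164.0037) is far above the median (23.5), which fits the strong right skew indicated by the skewness of 5.5358869.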
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 1119 | 11.2%
2 | 744 | 7.4%
4 | 482 | 4.8%
3 | 380 | 3.8%
5 | 330 | 3.3%
7 | 204 | 2.0%
8 | 191 | 1.9%
6 | 167 | 1.7%
11 | 164 | 1.6%
14 | 141 | 1.4%
Other values (727) | 6078 | 60.8%

Value | Count | Frequency (%)
1 | 1119 | 11.2%
2 | 744 | 7.4%
3 | 380 | 3.8%
4 | 482 | 4.8%
5 | 330 | 3.3%
6 | 167 | 1.7%
7 | 204 | 2.0%
8 | 191 | 1.9%
9 | 134 | 1.3%
10 | 89 | 0.9%

Value | Count | Frequency (%)
4914 | 3 | < 0.1%
4913 | 1 | < 0.1%
4912 | 1 | < 0.1%
4907 | 2 | < 0.1%
4903 | 3 | < 0.1%
4898 | 2 | < 0.1%
4896 | 1 | < 0.1%
3228 | 3 | < 0.1%
3226 | 3 | < 0.1%
3224 | 3 | < 0.1%

Interactions

[Figures: nine interaction plots; images not preserved]

Correlations

Variable | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수
처리일자 | 1.000 | 0.000 | 0.000 | 0.000
청소년유해업소업종코드 | 0.000 | 1.000 | 0.999 | 0.142
청소년유해업소업종명 | 0.000 | 0.999 | 1.000 | 0.739
건수 | 0.000 | 0.142 | 0.739 | 1.000

Variable | 처리일자 | 청소년유해업소업종코드 | 건수
처리일자 | 1.000 | -0.003 | 0.012
청소년유해업소업종코드 | -0.003 | 1.000 | -0.270
건수 | 0.012 | -0.270 | 1.000
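
The extracted text does not say which coefficient each matrix reports. As a point of reference only, the sketch below computes two common coefficients, Pearson and Spearman, on small made-up lists. The function names and the data are illustrative and are not taken from the report or from ydata-profiling.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; assumes equal lengths and non-constant inputs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Made-up data: monotonic but not linear.
xs = [1, 2, 3, 4]
ys = [1, 4, 9, 16]
print(round(pearson(xs, ys), 4), spearman(xs, ys))
```

On monotonic but non-linear data the rank-based coefficient reaches 1 while Pearson stays below it, which is one reason two matrices over the same columns can differ.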

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

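Row | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수
2480 | 20240308 | 10499 | 기타 (other) | 636
18160 | 20240322 | 10409 | 백화점 (department store) | 20
1345 | 20240307 | 10201 | 카바레 (cabaret) | 4
17329 | 20240321 | 10106 | 뷔페식 (buffet) | 10
2803 | 20240308 | 10118 | 식육취급 (meat handling) | 28
10613 | 20240315 | 24205 | 노래연습장업 (karaoke room business) | 214
8334 | 20240313 | 10299 | 기타 (other) | 2
21203 | 20240325 | 10101 | 한식 (Korean food) | 1979
21964 | 20240325 | 10410 | 편의점 (convenience store) | 108
27038 | 20240330 | 10210 | 간이주점 (small pub) | 3

Row | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수
26228 | 20240329 | 10403 | 일반조리판매 (general cooking and sales) | 245
20694 | 20240324 | 10202 | 고고(디스코)클럽 (go-go (disco) club) | 1
11268 | 20240316 | 10206 | 스텐드바 (stand bar) | 5
14866 | 20240319 | 10299 | 기타 (other) | 7
1370 | 20240307 | 10101 | 한식 (Korean food) | 1254
23475 | 20240327 | 20399 | 이용업 기타 (barbershop business, other) | 1
31111 | 20240403 | 10103 | 경양식 (light Western-style food) | 147
10657 | 20240315 | 20105 | 여관업 (inn business) | 45
18034 | 20240322 | 10412 | 커피숍 (coffee shop) | 457
6376 | 20240311 | 10402 | 다방 (dabang, tea and coffee room) | 77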

Duplicate rows

Most frequently occurring

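Row | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수 | # duplicates
9 | 20240306 | 10202 | 고고(디스코)클럽 (go-go (disco) club) | 1 | 5
266 | 20240314 | 20399 | 이용업 기타 (barbershop business, other) | 1 | 5
268 | 20240314 | 24201 | 비디오물감상실업 (video viewing room business) | 1 | 5
385 | 20240319 | 10202 | 고고(디스코)클럽 (go-go (disco) club) | 1 | 5
488 | 20240322 | 24201 | 비디오물감상실업 (video viewing room business) | 1 | 5
510 | 20240323 | 24201 | 비디오물감상실업 (video viewing room business) | 1 | 5
658 | 20240329 | 10202 | 고고(디스코)클럽 (go-go (disco) club) | 1 | 5
668 | 20240329 | 10413 | 전통찻집 (traditional tea house) | 1 | 5
700 | 20240330 | 10413 | 전통찻집 (traditional tea house) | 1 | 5
751 | 20240401 | 10202 | 고고(디스코)클럽 (go-go (disco) club) | 1 | 5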
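
A listing like the one above can be approximated by counting identical rows. The sketch below uses only the standard library. The function name is hypothetical and the toy rows and their multiplicities are made up for illustration. The report does not state whether its duplicate count refers to distinct repeated rows or to extra copies, so the sketch returns both.

```python
from collections import Counter


def summarize_duplicates(rows):
    """Return (distinct repeated rows, extra copies, repeated rows by frequency)."""
    counts = Counter(rows)
    repeated = {row: n for row, n in counts.items() if n > 1}
    distinct_repeated = len(repeated)
    extra_copies = sum(n - 1 for n in repeated.values())
    most_frequent = sorted(repeated.items(), key=lambda item: -item[1])
    return distinct_repeated, extra_copies, most_frequent


# Toy rows shaped like (처리일자, 업종코드, 업종명, 건수); multiplicities are made up.
rows = (
    [(20240306, 10202, "고고(디스코)클럽", 1)] * 5
    + [(20240329, 10413, "전통찻집", 1)] * 2
    + [(20240325, 10101, "한식", 1979)]
)

distinct_repeated, extra_copies, most_frequent = summarize_duplicates(rows)
print(distinct_repeated, extra_copies)
print(most_frequent[0])
```

Rows must be hashable (tuples here) for `Counter` to group them.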