Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 862
Duplicate rows (%): 8.6%
Total size in memory: 419.9 KiB
Average record size in memory: 43.0 B
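The two derived figures in these statistics (the duplicate percentage and the average record size) follow from the raw counts. The sketch below recomputes them; the variable names are illustrative and the inputs are copied from the report.

```python
# Figures copied from the "Dataset statistics" block of the report.
n_rows = 10_000
n_duplicates = 862
total_kib = 419.9

# Share of duplicate rows, as a percentage of all observations.
duplicate_pct = 100 * n_duplicates / n_rows

# Average record size in bytes (1 KiB = 1024 bytes).
avg_record_bytes = total_kib * 1024 / n_rows

print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```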

Variable types

Numeric: 3
Text: 1

Dataset

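Description: 처리일자 (processing date), 청소년유해업소업종코드 (business type code for establishments harmful to youth), 청소년유해업소업종명 (business type name for establishments harmful to youth), 건수 (count)
Author: 동작구 (Dongjak-gu)
URL: https://data.seoul.go.kr/dataList/OA-10578/S/1/datasetView.do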

Alerts

Dataset has 862 (8.6%) duplicate rows (Duplicates)

Reproduction

Analysis started: 2024-04-06 10:07:23.205212
Analysis finished: 2024-04-06 10:07:25.721366
Duration: 2.52 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
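A report of this kind can be regenerated from the source CSV with ydata-profiling. The sketch below is a minimal example, not the configuration actually used here: the function name, the file paths, and the default encoding are assumptions (Korean public datasets are sometimes distributed as cp949 rather than UTF-8).

```python
def build_report(csv_path, html_path="report.html", encoding="utf-8"):
    """Profile a CSV file and write the result as an HTML report.

    csv_path, html_path and encoding are hypothetical parameters chosen
    for this sketch; adjust them to the actual export.
    """
    # Imported inside the function so the third-party packages are only
    # required when a report is actually built.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Dataset profile")
    report.to_file(html_path)
    return report
```

Calling `build_report("data.csv")` with a real file path would write `report.html` next to the script.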

Variables

처리일자 (processing date)
Real number (ℝ)

Distinct: 31
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20240332
Minimum: 20240306
Maximum: 20240405
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 20240306
5-th percentile: 20240307
Q1: 20240313
Median: 20240321
Q3: 20240329
95-th percentile: 20240404
Maximum: 20240405
Range: 99
Interquartile range (IQR): 16

Descriptive statistics

Standard deviation: 31.964509
Coefficient of variation (CV): 1.5792482 × 10^-6
Kurtosis: 0.98963614
Mean: 20240332
Median Absolute Deviation (MAD): 8
Skewness: 1.6364834
Sum: 2.0240332 × 10^11
Variance: 1021.7299
Monotonicity: Not monotonic
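Because 처리일자 is profiled as a plain number in YYYYMMDD form, some of these statistics do not translate to calendar terms: the mean 20240332 is not a valid date, and the numeric range of 99 spans only 31 calendar days, which matches the 31 distinct values. A minimal sketch of parsing the values as dates first (the helper names are illustrative):

```python
from datetime import date, datetime


def parse_yyyymmdd(value: int) -> date:
    """Convert an integer such as 20240306 into a calendar date."""
    return datetime.strptime(str(value), "%Y%m%d").date()


def is_valid_yyyymmdd(value: int) -> bool:
    """Return True if the integer encodes a real calendar date."""
    try:
        parse_yyyymmdd(value)
    except ValueError:
        return False
    return True


first = parse_yyyymmdd(20240306)  # Minimum reported for 처리일자
last = parse_yyyymmdd(20240405)   # Maximum reported for 처리일자

# Inclusive number of calendar days covered by the column.
calendar_days = (last - first).days + 1

# The "Range" in the report is a plain integer difference.
integer_range = 20240405 - 20240306

print(calendar_days, integer_range, is_valid_yyyymmdd(20240332))
```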
Histogram with fixed size bins (bins=31)
Common values
Value | Count | Frequency (%)
20240404 | 353 | 3.5%
20240316 | 345 | 3.5%
20240319 | 343 | 3.4%
20240330 | 342 | 3.4%
20240325 | 341 | 3.4%
20240403 | 340 | 3.4%
20240318 | 338 | 3.4%
20240308 | 337 | 3.4%
20240331 | 337 | 3.4%
20240329 | 335 | 3.4%
Other values (21) | 6589 | 65.9%

Minimum 10 values
Value | Count | Frequency (%)
20240306 | 332 | 3.3%
20240307 | 305 | 3.0%
20240308 | 337 | 3.4%
20240309 | 332 | 3.3%
20240310 | 315 | 3.1%
20240311 | 309 | 3.1%
20240312 | 284 | 2.8%
20240313 | 306 | 3.1%
20240314 | 307 | 3.1%
20240315 | 327 | 3.3%

Maximum 10 values
Value | Count | Frequency (%)
20240405 | 314 | 3.1%
20240404 | 353 | 3.5%
20240403 | 340 | 3.4%
20240402 | 317 | 3.2%
20240401 | 309 | 3.1%
20240331 | 337 | 3.4%
20240330 | 342 | 3.4%
20240329 | 335 | 3.4%
20240328 | 334 | 3.3%
20240327 | 320 | 3.2%

청소년유해업소업종코드 (business type code)
Real number (ℝ)

Distinct: 59
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12917.577
Minimum: 10101
Maximum: 30111
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 10101
5-th percentile: 10103
Q1: 10113
Median: 10210
Q3: 10413
95-th percentile: 24201
Maximum: 30111
Range: 20010
Interquartile range (IQR): 300

Descriptive statistics

Standard deviation: 5223.3651
Coefficient of variation (CV): 0.40436107
Kurtosis: 1.262429
Mean: 12917.577
Median Absolute Deviation (MAD): 107
Skewness: 1.6302714
Sum: 1.2917577 × 10^8
Variance: 27283543
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
10410 | 244 | 2.4%
20301 | 243 | 2.4%
10102 | 241 | 2.4%
10112 | 240 | 2.4%
10111 | 238 | 2.4%
10402 | 237 | 2.4%
10107 | 236 | 2.4%
10301 | 235 | 2.4%
10411 | 235 | 2.4%
10208 | 233 | 2.3%
Other values (49) | 7618 | 76.2%

Minimum 10 values
Value | Count | Frequency (%)
10101 | 225 | 2.2%
10102 | 241 | 2.4%
10103 | 228 | 2.3%
10104 | 229 | 2.3%
10105 | 229 | 2.3%
10106 | 213 | 2.1%
10107 | 236 | 2.4%
10108 | 107 | 1.1%
10109 | 19 | 0.2%
10110 | 177 | 1.8%

Maximum 10 values
Value | Count | Frequency (%)
30111 | 125 | 1.2%
30110 | 55 | 0.5%
24205 | 215 | 2.1%
24201 | 212 | 2.1%
24113 | 219 | 2.2%
24101 | 101 | 1.0%
20399 | 152 | 1.5%
20301 | 243 | 2.4%
20199 | 81 | 0.8%
20107 | 187 | 1.9%

청소년유해업소업종명 (business type name)
Text

Distinct: 54
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 12
Median length: 9
Mean length: 4.2487
Min length: 2

Characters and Unicode

Total characters: 42487
Distinct characters: 128
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 다방
2nd row: 기타
3rd row: 이용업 기타
4th row: 일식
5th row: 여관업

Value | Count | Frequency (%)
기타 | 809 | 7.9%
패스트푸드 | 473 | 4.6%
전통찻집 | 301 | 2.9%
관광호텔 | 270 | 2.6%
편의점 | 244 | 2.4%
일반이용업 | 243 | 2.4%
중국식 | 241 | 2.4%
호프(소주방 | 240 | 2.3%
다방 | 237 | 2.3%
정종,대포집(선술집 | 236 | 2.3%
Other values (44) | 6939 | 67.8%

Most occurring characters

Value | Count | Frequency (%)
 | 1819 | 4.3%
 | 1586 | 3.7%
) | 1215 | 2.9%
( | 1215 | 2.9%
 | 1108 | 2.6%
 | 1032 | 2.4%
 | 879 | 2.1%
 | 809 | 1.9%
 | 809 | 1.9%
 | 786 | 1.8%
Other values (118) | 31229 | 73.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 39588 | 93.2%
Close Punctuation | 1215 | 2.9%
Open Punctuation | 1215 | 2.9%
Other Punctuation | 236 | 0.6%
Space Separator | 233 | 0.5%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 1819 | 4.6%
 | 1586 | 4.0%
 | 1108 | 2.8%
 | 1032 | 2.6%
 | 879 | 2.2%
 | 809 | 2.0%
 | 809 | 2.0%
 | 786 | 2.0%
 | 773 | 2.0%
 | 771 | 1.9%
Other values (114) | 29216 | 73.8%

Close Punctuation
Value | Count | Frequency (%)
) | 1215 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 1215 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 236 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 233 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 39588 | 93.2%
Common | 2899 | 6.8%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 1819 | 4.6%
 | 1586 | 4.0%
 | 1108 | 2.8%
 | 1032 | 2.6%
 | 879 | 2.2%
 | 809 | 2.0%
 | 809 | 2.0%
 | 786 | 2.0%
 | 773 | 2.0%
 | 771 | 1.9%
Other values (114) | 29216 | 73.8%

Common
Value | Count | Frequency (%)
) | 1215 | 41.9%
( | 1215 | 41.9%
, | 236 | 8.1%
(space) | 233 | 8.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 39588 | 93.2%
ASCII | 2899 | 6.8%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 1819 | 4.6%
 | 1586 | 4.0%
 | 1108 | 2.8%
 | 1032 | 2.6%
 | 879 | 2.2%
 | 809 | 2.0%
 | 809 | 2.0%
 | 786 | 2.0%
 | 773 | 2.0%
 | 771 | 1.9%
Other values (114) | 29216 | 73.8%

ASCII
Value | Count | Frequency (%)
) | 1215 | 41.9%
( | 1215 | 41.9%
, | 236 | 8.1%
(space) | 233 | 8.0%

건수 (count)
Real number (ℝ)

Distinct: 719
Distinct (%): 7.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 160.1578
Minimum: 1
Maximum: 4913
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
Median: 24
Q3: 139
95-th percentile: 630.05
Maximum: 4913
Range: 4912
Interquartile range (IQR): 135

Descriptive statistics

Standard deviation: 400.15731
Coefficient of variation (CV): 2.498519
Kurtosis: 41.165809
Mean: 160.1578
Median Absolute Deviation (MAD): 23
Skewness: 5.5950166
Sum: 1601578
Variance: 160125.87
Monotonicity: Not monotonic
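The statistics for 건수 describe a strongly right-skewed column: the mean (160.1578) is far above the median (24), and the skewness is 5.595. The sketch below shows the same pattern on a small scale, using only the ten 건수 values listed in the first table of the report's Sample section. It illustrates the mean/median gap and is not a recomputation of the full-dataset figures.

```python
from statistics import mean, median

# 건수 values copied from the first ten rows of the report's Sample section.
sample_counts = [123, 513, 8, 175, 15, 205, 1, 188, 2, 4]

sample_mean = mean(sample_counts)
sample_median = median(sample_counts)

# A few large values pull the mean well above the median.
print(f"mean={sample_mean}, median={sample_median}")
```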
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
1 | 1126 | 11.3%
2 | 694 | 6.9%
4 | 438 | 4.4%
3 | 367 | 3.7%
5 | 321 | 3.2%
7 | 217 | 2.2%
8 | 197 | 2.0%
6 | 170 | 1.7%
11 | 159 | 1.6%
9 | 143 | 1.4%
Other values (709) | 6168 | 61.7%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1126 | 11.3%
2 | 694 | 6.9%
3 | 367 | 3.7%
4 | 438 | 4.4%
5 | 321 | 3.2%
6 | 170 | 1.7%
7 | 217 | 2.2%
8 | 197 | 2.0%
9 | 143 | 1.4%
10 | 86 | 0.9%

Maximum 10 values
Value | Count | Frequency (%)
4913 | 2 | < 0.1%
4912 | 1 | < 0.1%
4908 | 1 | < 0.1%
4903 | 3 | < 0.1%
4902 | 1 | < 0.1%
4896 | 3 | < 0.1%
3228 | 5 | 0.1%
3226 | 4 | < 0.1%
3225 | 2 | < 0.1%
3224 | 1 | < 0.1%

Interactions

[Figures: nine interaction plots]

Correlations

 | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수
처리일자 | 1.000 | 0.000 | 0.000 | 0.000
청소년유해업소업종코드 | 0.000 | 1.000 | 0.999 | 0.143
청소년유해업소업종명 | 0.000 | 0.999 | 1.000 | 0.739
건수 | 0.000 | 0.143 | 0.739 | 1.000

 | 처리일자 | 청소년유해업소업종코드 | 건수
처리일자 | 1.000 | 0.002 | -0.004
청소년유해업소업종코드 | 0.002 | 1.000 | -0.276
건수 | -0.004 | -0.276 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

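(index) | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수
15268 | 20240319 | 10402 | 다방 | 123
33813 | 20240405 | 10199 | 기타 | 513
31609 | 20240403 | 20399 | 이용업 기타 | 8
32217 | 20240404 | 10104 | 일식 | 175
2069 | 20240307 | 20105 | 여관업 | 15
17265 | 20240321 | 10412 | 커피숍 | 205
21696 | 20240325 | 20101 | 관광호텔 | 1
17414 | 20240321 | 24205 | 노래연습장업 | 188
15114 | 20240319 | 10108 | 전통찻집 | 2
11743 | 20240316 | 10201 | 카바레 | 4

(index) | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수
30207 | 20240402 | 24205 | 노래연습장업 | 188
25104 | 20240328 | 10402 | 다방 | 14
3577 | 20240309 | 10101 | 한식 | 2095
26555 | 20240330 | 10499 | 기타 | 118
29530 | 20240401 | 10410 | 편의점 | 218
32278 | 20240404 | 10201 | 카바레 | 1
15853 | 20240320 | 10301 | 단란주점 | 67
3914 | 20240309 | 10119 | 탕류 | 7
9751 | 20240314 | 10404 | 관광호텔 | 3
31339 | 20240403 | 10402 | 다방 | 6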

Duplicate rows

Most frequently occurring

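(index) | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수 | # duplicates
793 | 20240403 | 10413 | 전통찻집 | 1 | 7
74 | 20240308 | 24201 | 비디오물감상실업 | 1 | 6
142 | 20240310 | 24201 | 비디오물감상실업 | 1 | 6
168 | 20240311 | 10413 | 전통찻집 | 1 | 6
600 | 20240327 | 20399 | 이용업 기타 | 1 | 6
190 | 20240312 | 10409 | 백화점 | 1 | 5
338 | 20240318 | 10201 | 카바레 | 2 | 5
416 | 20240320 | 24201 | 비디오물감상실업 | 1 | 5
506 | 20240324 | 10413 | 전통찻집 | 1 | 5
642 | 20240329 | 10202 | 고고(디스코)클럽 | 1 | 5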
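A duplicate row here is one whose four fields all match another row. One way to tally such rows is sketched below using only the standard library. The rows are toy data in the same column order as the report (the first two are deliberately identical), and `count_duplicates` is an illustrative helper, not the routine the profiling tool itself uses.

```python
from collections import Counter

# Toy rows in (처리일자, 청소년유해업소업종코드, 청소년유해업소업종명, 건수) order.
rows = [
    (20240403, 10413, "전통찻집", 1),
    (20240403, 10413, "전통찻집", 1),
    (20240308, 24201, "비디오물감상실업", 1),
    (20240319, 10402, "다방", 123),
]


def count_duplicates(records):
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(records)
    return {row: n for row, n in counts.items() if n > 1}


duplicates = count_duplicates(rows)
for row, n in duplicates.items():
    print(row, n)
```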