Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 918
Duplicate rows (%): 9.2%
Total size in memory: 419.9 KiB
Average record size in memory: 43.0 B
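
The derived figures in this block can be cross-checked from the raw counts. The following is a minimal arithmetic sketch; it assumes 1 KiB = 1024 bytes and that the percentage is simply duplicate rows divided by observations, neither of which the report states explicitly.

```python
# Values copied from the dataset statistics above.
n_observations = 10000
n_duplicate_rows = 918
total_size_kib = 419.9

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

# Assumption: percentage = duplicate rows / observations.
duplicate_pct = n_duplicate_rows / n_observations * 100

print(f"Average record size: {avg_record_bytes:.1f} B")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
```

Under these assumptions both results round to the values reported above.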

Variable types

Numeric: 3
Text: 1

Dataset

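Description: 처리일자 (processing date), 청소년유해업소업종코드 (youth-harmful business type code), 청소년유해업소업종명 (youth-harmful business type name), 건수 (count)
Author: 관악구 (Gwanak-gu)
URL: https://data.seoul.go.kr/dataList/OA-11502/S/1/datasetView.do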

Alerts

Dataset has 918 (9.2%) duplicate rows (alert type: Duplicates)

Reproduction

Analysis started: 2024-04-06 13:20:53.626016
Analysis finished: 2024-04-06 13:20:55.343972
Duration: 1.72 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

처리일자 (processing date)
Real number (ℝ)

Distinct: 31
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20240332
Minimum: 20240306
Maximum: 20240405
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB
[Figure: plot rendered with Matplotlib v3.7.2, https://matplotlib.org/]

Quantile statistics

Minimum: 20240306
5-th percentile: 20240307
Q1: 20240313
Median: 20240321
Q3: 20240329
95-th percentile: 20240404
Maximum: 20240405
Range: 99
Interquartile range (IQR): 16
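
The values of 처리일자 look like dates encoded as integers in YYYYMMDD form (an inference from the values, not something the report states), so the numeric statistics above are computed on integers rather than on calendar dates. The sketch below, with a hypothetical helper `parse_yyyymmdd`, shows how the reported range of 99 relates to the actual number of days.

```python
from datetime import date, datetime


def parse_yyyymmdd(value: int) -> date:
    """Convert an integer such as 20240306 to a date (assumes YYYYMMDD)."""
    return datetime.strptime(str(value), "%Y%m%d").date()


minimum = parse_yyyymmdd(20240306)
maximum = parse_yyyymmdd(20240405)

numeric_range = 20240405 - 20240306       # integer difference, as in "Range"
calendar_span = (maximum - minimum).days  # elapsed days between the two dates
days_covered = calendar_span + 1          # number of dates, both ends included

print(numeric_range, calendar_span, days_covered)  # 99 30 31
```

The inclusive count of 31 dates agrees with the 31 distinct values reported for this variable.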

Descriptive statistics

Standard deviation: 31.899216
Coefficient of variation (CV): 1.5760224 × 10^-6
Kurtosis: 1.0516326
Mean: 20240332
Median Absolute Deviation (MAD): 8
Skewness: 1.6558406
Sum: 2.0240332 × 10^11
Variance: 1017.56
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=31)]
Most frequent values:
Value              Count  Frequency (%)
20240306             346  3.5%
20240404             343  3.4%
20240314             338  3.4%
20240307             338  3.4%
20240315             338  3.4%
20240328             332  3.3%
20240329             332  3.3%
20240320             330  3.3%
20240317             330  3.3%
20240311             330  3.3%
Other values (21)   6643  66.4%

Smallest 10 values:
Value     Count  Frequency (%)
20240306    346  3.5%
20240307    338  3.4%
20240308    315  3.1%
20240309    324  3.2%
20240310    324  3.2%
20240311    330  3.3%
20240312    309  3.1%
20240313    323  3.2%
20240314    338  3.4%
20240315    338  3.4%

Largest 10 values:
Value     Count  Frequency (%)
20240405    312  3.1%
20240404    343  3.4%
20240403    319  3.2%
20240402    322  3.2%
20240401    319  3.2%
20240331    295  2.9%
20240330    301  3.0%
20240329    332  3.3%
20240328    332  3.3%
20240327    304  3.0%

청소년유해업소업종코드 (youth-harmful business type code)
Real number (ℝ)

Distinct: 59
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12883.416
Minimum: 10101
Maximum: 30111
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 10101
5-th percentile: 10103
Q1: 10113
Median: 10210
Q3: 10413
95-th percentile: 24201
Maximum: 30111
Range: 20010
Interquartile range (IQR): 300

Descriptive statistics

Standard deviation: 5200.1357
Coefficient of variation (CV): 0.4036302
Kurtosis: 1.3368641
Mean: 12883.416
Median Absolute Deviation (MAD): 107
Skewness: 1.6520919
Sum: 1.2883416 × 10^8
Variance: 27041411
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=50)]
Most frequent values:
Value              Count  Frequency (%)
10499                248  2.5%
10116                248  2.5%
24205                247  2.5%
10118                244  2.4%
10403                242  2.4%
20301                240  2.4%
20101                236  2.4%
10199                234  2.3%
10106                234  2.3%
10102                234  2.3%
Other values (49)   7593  75.9%

Smallest 10 values:
Value  Count  Frequency (%)
10101    217  2.2%
10102    234  2.3%
10103    215  2.1%
10104    199  2.0%
10105    205  2.1%
10106    234  2.3%
10107    211  2.1%
10108    112  1.1%
10109     17  0.2%
10110    200  2.0%

Largest 10 values:
Value  Count  Frequency (%)
30111    121  1.2%
30110     57  0.6%
24205    247  2.5%
24201    200  2.0%
24113    224  2.2%
24101     69  0.7%
20399    144  1.4%
20301    240  2.4%
20199     99  1.0%
20107    177  1.8%

청소년유해업소업종명 (youth-harmful business type name)
Text

Distinct: 54
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 12
Median length: 9
Mean length: 4.2464
Min length: 2

Characters and Unicode

Total characters: 42464
Distinct characters: 128
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
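
As an illustration of the character properties mentioned above, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. The three strings are values that appear in this report's sample rows; the tally is only a small example and is not how the report's totals were produced.

```python
import unicodedata
from collections import Counter

# A few business type names that appear in this report's sample rows.
names = ["김밥(도시락)", "이용업 기타", "백화점"]

# Two-letter general category per character:
# Lo = Other Letter, Ps = Open Punctuation, Pe = Close Punctuation,
# Zs = Space Separator, Po = Other Punctuation.
category_counts = Counter(
    unicodedata.category(ch) for name in names for ch in name
)
lengths = [len(name) for name in names]

print(category_counts)
print("total characters:", sum(lengths))
```

Hangul syllables fall under "Other Letter" (Lo), which is why that category dominates the category counts in this section.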

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 백화점 (department store)
2nd row: 까페 (café)
3rd row: 전통찻집 (traditional teahouse)
4th row: 관광호텔 (tourist hotel)
5th row: 여관업 (inn business)
Most frequent words:
Value                                    Count  Frequency (%)
기타 (other)                               889  8.7%
패스트푸드 (fast food)                     448  4.4%
전통찻집 (traditional teahouse)            327  3.2%
관광호텔 (tourist hotel)                   281  2.7%
생선회 (raw fish)                          248  2.4%
노래연습장업 (karaoke room business)       247  2.4%
식육취급 (meat handling)                   244  2.4%
일반조리판매 (general cooking and sales)   242  2.4%
일반이용업 (general barbershop business)   240  2.3%
뷔페식 (buffet style)                      234  2.3%
Other values (44)                         6843  66.8%

Most occurring characters

Value               Count  Frequency (%)
[missing]            1780  4.2%
[missing]            1558  3.7%
(                    1185  2.8%
)                    1185  2.8%
[missing]            1085  2.6%
[missing]            1016  2.4%
[missing]             889  2.1%
[missing]             889  2.1%
[missing]             886  2.1%
[missing]             781  1.8%
Other values (118)  31210  73.5%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       39640  93.3%
Open Punctuation    1185  2.8%
Close Punctuation   1185  2.8%
Space Separator      243  0.6%
Other Punctuation    211  0.5%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
[missing]            1780  4.5%
[missing]            1558  3.9%
[missing]            1085  2.7%
[missing]            1016  2.6%
[missing]             889  2.2%
[missing]             889  2.2%
[missing]             886  2.2%
[missing]             781  2.0%
[missing]             750  1.9%
[missing]             749  1.9%
Other values (114)  29257  73.8%

Open Punctuation
Value  Count  Frequency (%)
(       1185  100.0%

Close Punctuation
Value  Count  Frequency (%)
)       1185  100.0%

Space Separator
Value    Count  Frequency (%)
(space)    243  100.0%

Other Punctuation
Value  Count  Frequency (%)
,        211  100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  39640  93.3%
Common   2824  6.7%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
[missing]            1780  4.5%
[missing]            1558  3.9%
[missing]            1085  2.7%
[missing]            1016  2.6%
[missing]             889  2.2%
[missing]             889  2.2%
[missing]             886  2.2%
[missing]             781  2.0%
[missing]             750  1.9%
[missing]             749  1.9%
Other values (114)  29257  73.8%

Common
Value    Count  Frequency (%)
(         1185  42.0%
)         1185  42.0%
(space)    243  8.6%
,          211  7.5%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  39640  93.3%
ASCII    2824  6.7%

Most frequent character per block

Hangul
Value               Count  Frequency (%)
[missing]            1780  4.5%
[missing]            1558  3.9%
[missing]            1085  2.7%
[missing]            1016  2.6%
[missing]             889  2.2%
[missing]             889  2.2%
[missing]             886  2.2%
[missing]             781  2.0%
[missing]             750  1.9%
[missing]             749  1.9%
Other values (114)  29257  73.8%

ASCII
Value    Count  Frequency (%)
(         1185  42.0%
)         1185  42.0%
(space)    243  8.6%
,          211  7.5%

건수 (count)
Real number (ℝ)

Distinct: 734
Distinct (%): 7.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 156.7415
Minimum: 1
Maximum: 4914
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
Median: 22
Q3: 132
95-th percentile: 612.75
Maximum: 4914
Range: 4913
Interquartile range (IQR): 128

Descriptive statistics

Standard deviation: 398.94932
Coefficient of variation (CV): 2.5452692
Kurtosis: 42.913551
Mean: 156.7415
Median Absolute Deviation (MAD): 21
Skewness: 5.708663
Sum: 1567415
Variance: 159160.56
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=50)]
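
Several of the statistics reported for 건수 are simple functions of the others. The sketch below recomputes them from the reported figures, assuming the standard definitions (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1, range = maximum - minimum); the report does not spell out its formulas.

```python
import math

# Figures copied from the statistics reported for 건수.
mean = 156.7415
std_dev = 398.94932
q1, q3 = 4, 132
minimum, maximum = 1, 4914

cv = std_dev / mean              # coefficient of variation
variance = std_dev ** 2          # variance from the standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(f"CV={cv:.4f}, variance={variance:.2f}, IQR={iqr}, range={value_range}")
```

A mean (156.7) far above the median (22), together with a CV above 2, is consistent with the strong right skew reported for this variable.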
Most frequent values:
Value               Count  Frequency (%)
1                    1119  11.2%
2                     679  6.8%
4                     507  5.1%
3                     403  4.0%
5                     339  3.4%
7                     211  2.1%
8                     200  2.0%
11                    168  1.7%
6                     167  1.7%
9                     151  1.5%
Other values (724)   6056  60.6%

Smallest 10 values:
Value  Count  Frequency (%)
1       1119  11.2%
2        679  6.8%
3        403  4.0%
4        507  5.1%
5        339  3.4%
6        167  1.7%
7        211  2.1%
8        200  2.0%
9        151  1.5%
10       116  1.2%

Largest 10 values:
Value  Count  Frequency (%)
4914       2  < 0.1%
4913       1  < 0.1%
4908       1  < 0.1%
4907       1  < 0.1%
4905       1  < 0.1%
4903       4  < 0.1%
4897       1  < 0.1%
4896       1  < 0.1%
3228       2  < 0.1%
3226       2  < 0.1%

Interactions

[Figure: interaction plots (9 panels), rendered with Matplotlib v3.7.2]

Correlations

[Figure: correlation heatmap]
                        처리일자  청소년유해업소업종코드  청소년유해업소업종명  건수
처리일자                1.000     0.000                   0.000                 0.020
청소년유해업소업종코드  0.000     1.000                   0.999                 0.133
청소년유해업소업종명    0.000     0.999                   1.000                 0.739
건수                    0.020     0.133                   0.739                 1.000

[Figure: correlation heatmap]
                        처리일자  청소년유해업소업종코드  건수
처리일자                1.000     0.004                   0.003
청소년유해업소업종코드  0.004     1.000                   -0.250
건수                    0.003     -0.250                  1.000
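
The extracted text does not say which correlation coefficient each matrix reports, so the following is only a reference point: a minimal sketch of the Pearson coefficient between two numeric columns. The function name `pearson` and the toy data are hypothetical and are not taken from the dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy columns, not taken from the dataset.
codes = [10101, 10210, 20301, 24205, 30111]
counts = [400, 150, 90, 30, 5]

r = pearson(codes, counts)
print(round(r, 3))
```

A coefficient is symmetric in its two arguments, which is why each matrix above mirrors across its diagonal of 1.000 values.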

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index)  처리일자  청소년유해업소업종코드  청소년유해업소업종명  건수
29998  20240402  10409  백화점  79
29216  20240401  10117  까페  108
33867  20240405  10413  전통찻집  2
10646  20240315  10404  관광호텔  2
795  20240306  20105  여관업  36
27370  20240330  10119  탕류  1
11934  20240316  10108  전통찻집  1
23408  20240327  10115  김밥(도시락)  4
14656  20240319  20399  이용업 기타  1
17665  20240321  10116  생선회  26

(index)  처리일자  청소년유해업소업종코드  청소년유해업소업종명  건수
33370  20240405  10114  복어취급  2
32123  20240404  10408  철도역구내  1
14787  20240319  10111  패스트푸드  17
2897  20240308  10102  중국식  204
4247  20240309  10112  호프(소주방)  448
29540  20240401  24113  일반게임제공업  12
18966  20240323  10409  백화점  20
30080  20240402  10105  분식  229
19675  20240323  10210  간이주점  2
25771  20240329  24113  일반게임제공업  2

Duplicate rows

Most frequently occurring

(index)  처리일자  청소년유해업소업종코드  청소년유해업소업종명  건수  # duplicates
458  20240320  10413  전통찻집  1  7
890  20240404  24201  비디오물감상실업  1  7
728  20240329  10413  전통찻집  1  6
14  20240306  10409  백화점  1  5
24  20240306  24201  비디오물감상실업  1  5
163  20240311  10206  스텐드바  1  5
273  20240314  20399  이용업 기타  1  5
394  20240318  10413  전통찻집  1  5
431  20240319  20399  이용업 기타  1  5
485  20240321  10409  백화점  1  5
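
A table like the one above can be approximated by counting identical rows. The sketch below uses only the standard library and hypothetical rows shaped like this dataset's four columns; it also assumes one possible definition of "duplicate rows" (repeats beyond the first copy), since the report does not state its exact counting rule.

```python
from collections import Counter

# Hypothetical rows shaped like (처리일자, 업종코드, 업종명, 건수).
rows = [
    (20240320, 10413, "전통찻집", 1),
    (20240320, 10413, "전통찻집", 1),
    (20240320, 10413, "전통찻집", 1),
    (20240306, 10409, "백화점", 1),
    (20240306, 10409, "백화점", 1),
    (20240402, 10409, "백화점", 79),
]

occurrences = Counter(rows)

# Keep only rows that occur more than once, most frequent first.
duplicate_groups = [(row, n) for row, n in occurrences.most_common() if n > 1]

# Assumed definition: every copy after the first counts as a duplicate row.
n_repeats = sum(n - 1 for _, n in duplicate_groups)

for row, n in duplicate_groups:
    print(row, n)
print("repeated rows:", n_repeats)
```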