Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 882
Duplicate rows (%): 8.8%
Total size in memory: 419.9 KiB
Average record size in memory: 43.0 B
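
The two derived figures in this block can be checked from the raw counts. The sketch below is not part of the report; it assumes that 1 KiB is 1024 bytes and that the percentages are rounded to one decimal place. The variable names are illustrative.

```python
# Values copied from the "Dataset statistics" block.
n_observations = 10_000
duplicate_rows = 882
total_size_kib = 419.9

# Share of duplicate rows, as a percentage of all observations.
duplicate_pct = 100 * duplicate_rows / n_observations

# Average record size in bytes, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{duplicate_pct:.1f}%")      # 8.8%
print(f"{avg_record_bytes:.1f} B")  # 43.0 B
```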

Variable types

Numeric: 3
Text: 1

Dataset

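Description: 처리일자 (processing date), 청소년유해업소업종코드 (youth-harmful business type code), 청소년유해업소업종명 (youth-harmful business type name), 건수 (count)
Author: Yongsan-gu (용산구)
URL: https://data.seoul.go.kr/dataList/OA-11194/S/1/datasetView.do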

Alerts

Dataset has 882 (8.8%) duplicate rows (alert type: Duplicates)
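
The report does not say whether the 882 figure counts distinct rows that occur more than once or the surplus copies. A minimal sketch of both counts follows, using toy rows shaped like this dataset; `duplicate_summary` is a hypothetical helper name and the row multiplicities are made up for illustration.

```python
from collections import Counter

# Toy rows shaped like the dataset: (처리일자, 업종코드, 업종명, 건수).
# The multiplicities here are illustrative, not taken from the report.
rows = [
    (20240229, 24201, "비디오물감상실업", 1),
    (20240229, 24201, "비디오물감상실업", 1),
    (20240229, 24201, "비디오물감상실업", 1),
    (20240301, 10408, "철도역구내", 1),
    (20240301, 10408, "철도역구내", 1),
    (20240303, 10101, "한식", 1546),
]


def duplicate_summary(rows):
    """Return (groups, surplus): rows seen more than once, and extra copies."""
    counts = Counter(rows)
    groups = {row: n for row, n in counts.items() if n > 1}
    surplus = sum(n - 1 for n in groups.values())
    return groups, surplus


groups, surplus = duplicate_summary(rows)
print(len(groups))  # distinct rows that occur more than once
print(surplus)      # copies beyond the first occurrence of each such row
```

Either count can be divided by the number of observations to obtain a percentage, so it is worth confirming which definition a tool uses before comparing figures.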

Reproduction

Analysis started: 2024-03-13 12:00:58.686456
Analysis finished: 2024-03-13 12:01:00.266056
Duration: 1.58 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

처리일자 (processing date)
Real number (ℝ)

Distinct: 29
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20240249
Minimum: 20240211
Maximum: 20240310
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 20240211
5th percentile: 20240212
Q1: 20240218
Median: 20240225
Q3: 20240303
95th percentile: 20240309
Maximum: 20240310
Range: 99
Interquartile range (IQR): 85
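
The values of 처리일자 look like dates encoded as YYYYMMDD integers (an assumption; the report only types the column as a real number). Under that reading, the Range of 99 and the IQR of 85 are differences between encoded integers rather than day counts. A minimal sketch of recomputing them on parsed dates, using only the quantiles reported above; `parse_yyyymmdd` is a hypothetical helper name.

```python
from datetime import datetime


def parse_yyyymmdd(value: int) -> datetime:
    """Interpret an integer such as 20240211 as the date 2024-02-11."""
    return datetime.strptime(str(value), "%Y%m%d")


# Quantiles copied from the report.
minimum, q1, q3, maximum = 20240211, 20240218, 20240303, 20240310

# Differences on the raw integers, as the report computes them.
integer_range = maximum - minimum
integer_iqr = q3 - q1

# Differences in calendar days after parsing.
day_range = (parse_yyyymmdd(maximum) - parse_yyyymmdd(minimum)).days
day_iqr = (parse_yyyymmdd(q3) - parse_yyyymmdd(q1)).days

print(integer_range, day_range)  # 99 28
print(integer_iqr, day_iqr)      # 85 14
```

A 28-day span covers 29 calendar days inclusive, which matches the 29 distinct values reported for this variable. The mean, standard deviation, and skewness of the raw integers are affected by the same encoding.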

Descriptive statistics

Standard deviation: 40.8236
Coefficient of variation (CV): 2.0169515 × 10^-6
Kurtosis: -1.5313378
Mean: 20240249
Median Absolute Deviation (MAD): 10
Skewness: 0.64030607
Sum: 2.0240249 × 10^11
Variance: 1666.5663
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=29)]

Common values
Value | Count | Frequency (%)
20240211 | 382 | 3.8%
20240222 | 378 | 3.8%
20240216 | 368 | 3.7%
20240217 | 367 | 3.7%
20240223 | 360 | 3.6%
20240218 | 358 | 3.6%
20240301 | 354 | 3.5%
20240307 | 354 | 3.5%
20240212 | 353 | 3.5%
20240228 | 349 | 3.5%
Other values (19) | 6377 | 63.8%

Minimum 10 values
Value | Count | Frequency (%)
20240211 | 382 | 3.8%
20240212 | 353 | 3.5%
20240213 | 340 | 3.4%
20240214 | 333 | 3.3%
20240215 | 324 | 3.2%
20240216 | 368 | 3.7%
20240217 | 367 | 3.7%
20240218 | 358 | 3.6%
20240219 | 309 | 3.1%
20240220 | 345 | 3.5%

Maximum 10 values
Value | Count | Frequency (%)
20240310 | 329 | 3.3%
20240309 | 337 | 3.4%
20240308 | 346 | 3.5%
20240307 | 354 | 3.5%
20240306 | 326 | 3.3%
20240305 | 339 | 3.4%
20240304 | 345 | 3.5%
20240303 | 338 | 3.4%
20240302 | 339 | 3.4%
20240301 | 354 | 3.5%

청소년유해업소업종코드 (youth-harmful business type code)
Real number (ℝ)

Distinct: 59
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12843.953
Minimum: 10101
Maximum: 30111
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 10101
5th percentile: 10103
Q1: 10113
Median: 10210
Q3: 10413
95th percentile: 24201
Maximum: 30111
Range: 20010
Interquartile range (IQR): 300

Descriptive statistics

Standard deviation: 5184.5992
Coefficient of variation (CV): 0.40366071
Kurtosis: 1.4768275
Mean: 12843.953
Median Absolute Deviation (MAD): 107
Skewness: 1.6881197
Sum: 1.2843953 × 10^8
Variance: 26880068
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values
Value | Count | Frequency (%)
10402 | 255 | 2.5%
20105 | 253 | 2.5%
10411 | 245 | 2.5%
10111 | 242 | 2.4%
10403 | 241 | 2.4%
10301 | 238 | 2.4%
10105 | 235 | 2.4%
10104 | 234 | 2.3%
10103 | 233 | 2.3%
10101 | 232 | 2.3%
Other values (49) | 7592 | 75.9%

Minimum 10 values
Value | Count | Frequency (%)
10101 | 232 | 2.3%
10102 | 228 | 2.3%
10103 | 233 | 2.3%
10104 | 234 | 2.3%
10105 | 235 | 2.4%
10106 | 217 | 2.2%
10107 | 215 | 2.1%
10108 | 113 | 1.1%
10109 | 22 | 0.2%
10110 | 178 | 1.8%

Maximum 10 values
Value | Count | Frequency (%)
30111 | 129 | 1.3%
30110 | 57 | 0.6%
24205 | 216 | 2.2%
24201 | 216 | 2.2%
24113 | 219 | 2.2%
24101 | 67 | 0.7%
20399 | 127 | 1.3%
20301 | 220 | 2.2%
20199 | 68 | 0.7%
20107 | 199 | 2.0%

청소년유해업소업종명 (youth-harmful business type name)
Text

Distinct: 54
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 12
Median length: 9
Mean length: 4.2191
Min length: 2

Characters and Unicode

Total characters: 42191
Distinct characters: 128
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
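
As a rough illustration of how such per-category counts can be obtained, the sketch below uses Python's standard `unicodedata` module on two names that appear in this dataset. It is not the profiler's own implementation; `category_counts` and `CATEGORY_NAMES` are hypothetical names, and the mapping covers only the five categories this report lists.

```python
import unicodedata
from collections import Counter

# Two-letter Unicode general category codes and their long names,
# limited to the five categories that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Zs": "Space Separator",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        # Normalize to NFC so each Hangul syllable is a single code point.
        for ch in unicodedata.normalize("NFC", value):
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


counts = category_counts(["김밥(도시락)", "이용업 기타"])
for name, n in counts.most_common():
    print(name, n)
```

Hangul syllables fall under "Other Letter" because the script has no upper or lower case, which is why that category dominates the tables below.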

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 패스트푸드 (fast food)
2nd row: 기타 (other)
3rd row: 간이주점 (simple pub)
4th row: 노래연습장업 (karaoke room business)
5th row: 백화점 (department store)

Value | Count | Frequency (%)
기타 (other) | 813 | 8.0%
패스트푸드 (fast food) | 487 | 4.8%
전통찻집 (traditional teahouse) | 324 | 3.2%
관광호텔 (tourist hotel) | 270 | 2.6%
다방 (coffee house) | 255 | 2.5%
여관업 (inn business) | 253 | 2.5%
일반조리판매 (general cooked-food sales) | 241 | 2.4%
단란주점 (karaoke bar) | 238 | 2.3%
분식 (snack food) | 235 | 2.3%
일식 (Japanese food) | 234 | 2.3%
Other values (44) | 6845 | 67.1%

Most occurring characters

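Value | Count | Frequency (%)
[character lost] | 1771 | 4.2%
[character lost] | 1626 | 3.9%
) | 1186 | 2.8%
( | 1186 | 2.8%
[character lost] | 1084 | 2.6%
[character lost] | 991 | 2.3%
[character lost] | 850 | 2.0%
[character lost] | 837 | 2.0%
[character lost] | 813 | 1.9%
[character lost] | 813 | 1.9%
Other values (118) | 31034 | 73.6%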

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 39409 | 93.4%
Close Punctuation | 1186 | 2.8%
Open Punctuation | 1186 | 2.8%
Other Punctuation | 215 | 0.5%
Space Separator | 195 | 0.5%

Most frequent character per category

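Other Letter
Value | Count | Frequency (%)
[character lost] | 1771 | 4.5%
[character lost] | 1626 | 4.1%
[character lost] | 1084 | 2.8%
[character lost] | 991 | 2.5%
[character lost] | 850 | 2.2%
[character lost] | 837 | 2.1%
[character lost] | 813 | 2.1%
[character lost] | 813 | 2.1%
[character lost] | 754 | 1.9%
[character lost] | 714 | 1.8%
Other values (114) | 29156 | 74.0%

Close Punctuation
Value | Count | Frequency (%)
) | 1186 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 1186 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 215 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 195 | 100.0%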

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 39409 | 93.4%
Common | 2782 | 6.6%

Most frequent character per script

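Hangul
Value | Count | Frequency (%)
[character lost] | 1771 | 4.5%
[character lost] | 1626 | 4.1%
[character lost] | 1084 | 2.8%
[character lost] | 991 | 2.5%
[character lost] | 850 | 2.2%
[character lost] | 837 | 2.1%
[character lost] | 813 | 2.1%
[character lost] | 813 | 2.1%
[character lost] | 754 | 1.9%
[character lost] | 714 | 1.8%
Other values (114) | 29156 | 74.0%

Common
Value | Count | Frequency (%)
) | 1186 | 42.6%
( | 1186 | 42.6%
, | 215 | 7.7%
(space) | 195 | 7.0%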

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 39409 | 93.4%
ASCII | 2782 | 6.6%

Most frequent character per block

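Hangul
Value | Count | Frequency (%)
[character lost] | 1771 | 4.5%
[character lost] | 1626 | 4.1%
[character lost] | 1084 | 2.8%
[character lost] | 991 | 2.5%
[character lost] | 850 | 2.2%
[character lost] | 837 | 2.1%
[character lost] | 813 | 2.1%
[character lost] | 813 | 2.1%
[character lost] | 754 | 1.9%
[character lost] | 714 | 1.8%
Other values (114) | 29156 | 74.0%

ASCII
Value | Count | Frequency (%)
) | 1186 | 42.6%
( | 1186 | 42.6%
, | 215 | 7.7%
(space) | 195 | 7.0%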

건수 (count)
Real number (ℝ)

Distinct: 700
Distinct (%): 7.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 154.2253
Minimum: 1
Maximum: 4899
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 4
Median: 24
Q3: 130.25
95th percentile: 585.9
Maximum: 4899
Range: 4898
Interquartile range (IQR): 126.25

Descriptive statistics

Standard deviation: 381.27061
Coefficient of variation (CV): 2.4721664
Kurtosis: 39.755878
Mean: 154.2253
Median Absolute Deviation (MAD): 23
Skewness: 5.4750341
Sum: 1542253
Variance: 145367.28
Monotonicity: Not monotonic
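
Several of the figures for 건수 follow directly from others, which gives a quick consistency check. The sketch below recomputes them from the reported mean, standard deviation, quartiles, and extremes, using the standard definitions (CV as standard deviation over mean, variance as the squared standard deviation). It is a check on the published numbers, not the profiler's own code.

```python
import math

# Values copied from the report for 건수.
n = 10_000
mean, std = 154.2253, 381.27061
q1, q3 = 4, 130.25
minimum, maximum = 1, 4899

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
total = mean * n                 # sum of all values
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(round(cv, 4), round(variance, 2), round(total), iqr, value_range)
```

A CV above 2 together with a skewness of about 5.5 points to a heavily right-skewed distribution, so the median of 24 describes a typical record better than the mean of about 154.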

[Figure: histogram with fixed size bins (bins=50)]

Common values
Value | Count | Frequency (%)
1 | 1091 | 10.9%
2 | 696 | 7.0%
4 | 483 | 4.8%
3 | 386 | 3.9%
5 | 338 | 3.4%
8 | 202 | 2.0%
7 | 200 | 2.0%
6 | 170 | 1.7%
9 | 150 | 1.5%
14 | 147 | 1.5%
Other values (690) | 6137 | 61.4%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1091 | 10.9%
2 | 696 | 7.0%
3 | 386 | 3.9%
4 | 483 | 4.8%
5 | 338 | 3.4%
6 | 170 | 1.7%
7 | 200 | 2.0%
8 | 202 | 2.0%
9 | 150 | 1.5%
10 | 108 | 1.1%

Maximum 10 values
Value | Count | Frequency (%)
4899 | 2 | < 0.1%
4897 | 2 | < 0.1%
4896 | 2 | < 0.1%
4892 | 2 | < 0.1%
3226 | 1 | < 0.1%
3223 | 2 | < 0.1%
3221 | 1 | < 0.1%
3126 | 2 | < 0.1%
3125 | 2 | < 0.1%
3123 | 3 | < 0.1%

Interactions

[Figures: nine pairwise interaction plots of the three numeric variables]

Correlations

[Figure: correlation heatmap, all variables]
 | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수
처리일자 | 1.000 | 0.000 | 0.000 | 0.000
청소년유해업소업종코드 | 0.000 | 1.000 | 0.999 | 0.130
청소년유해업소업종명 | 0.000 | 0.999 | 1.000 | 0.739
건수 | 0.000 | 0.130 | 0.739 | 1.000

[Figure: correlation heatmap, numeric variables]
 | 처리일자 | 청소년유해업소업종코드 | 건수
처리일자 | 1.000 | -0.005 | 0.004
청소년유해업소업종코드 | -0.005 | 1.000 | -0.268
건수 | 0.004 | -0.268 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

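First rows
(index) | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수
19699 | 20240222 | 10411 | 패스트푸드 (fast food) | 49
23943 | 20240218 | 10499 | 기타 (other) | 118
31989 | 20240211 | 10210 | 간이주점 (simple pub) | 9
9660 | 20240302 | 24205 | 노래연습장업 (karaoke room business) | 222
6788 | 20240304 | 10409 | 백화점 (department store) | 1
7832 | 20240303 | 10102 | 중국식 (Chinese food) | 334
1574 | 20240309 | 10401 | 과자점 (confectionery) | 4
19831 | 20240222 | 10411 | 패스트푸드 (fast food) | 36
1190 | 20240309 | 10116 | 생선회 (raw fish) | 14
25268 | 20240217 | 10114 | 복어취급 (pufferfish handling) | 11

Last rows
(index) | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수
1221 | 20240309 | 10403 | 일반조리판매 (general cooked-food sales) | 98
16741 | 20240224 | 20102 | 일반호텔 (general hotel) | 6
31206 | 20240211 | 10206 | 스텐드바 (stand bar) | 7
4270 | 20240307 | 10199 | 기타 (other) | 511
8342 | 20240303 | 10101 | 한식 (Korean food) | 1546
10895 | 20240301 | 20101 | 관광호텔 (tourist hotel) | 3
24461 | 20240217 | 10117 | 까페 (café) | 20
16555 | 20240225 | 10115 | 김밥(도시락) (gimbap, lunch box) | 24
5291 | 20240306 | 10199 | 기타 (other) | 1446
19545 | 20240222 | 10102 | 중국식 (Chinese food) | 77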

Duplicate rows

Most frequently occurring

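(index) | 처리일자 | 청소년유해업소업종코드 | 청소년유해업소업종명 | 건수 | # duplicates
576 | 20240229 | 24201 | 비디오물감상실업 (video viewing room business) | 1 | 7
595 | 20240301 | 10408 | 철도역구내 (railway station premises) | 1 | 6
762 | 20240306 | 24201 | 비디오물감상실업 (video viewing room business) | 1 | 6
34 | 20240211 | 20399 | 이용업 기타 (barbershop business, other) | 1 | 5
221 | 20240217 | 10413 | 전통찻집 (traditional teahouse) | 1 | 5
307 | 20240220 | 10413 | 전통찻집 (traditional teahouse) | 1 | 5
349 | 20240221 | 24201 | 비디오물감상실업 (video viewing room business) | 1 | 5
398 | 20240223 | 10210 | 간이주점 (simple pub) | 1 | 5
469 | 20240226 | 10202 | 고고(디스코)클럽 (go-go (disco) club) | 1 | 5
495 | 20240227 | 10201 | 카바레 (cabaret) | 2 | 5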