Overview

Dataset statistics

Number of variables: 3
Number of observations: 21
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 678.0 B
Average record size in memory: 32.3 B

Variable types

Text: 1
Numeric: 1
Categorical: 1

Dataset

Description: Non-tax revenue information for public service reservations in Gyeongju-si, Gyeongsangbuk-do.
Author: Gyeongju-si, Gyeongsangbuk-do (경상북도 경주시)
URL: https://www.data.go.kr/data/15089740/fileData.do

Alerts

Imbalance: 세외수입연계코드 is highly imbalanced (72.4%)
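The 72.4% figure is consistent with an entropy-based imbalance score, 1 - H / log2(k), where H is the Shannon entropy of the class shares and k is the number of classes. The sketch below is an assumption about how such a score can be computed, not a statement of the profiler's internals. `imbalance_score` is a hypothetical helper, and the class counts (20 and 1) are taken from the 세외수입연계코드 value table in this report.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of class counts.

    H is the Shannon entropy (in bits) of the class shares and k is the
    number of classes. 0 means perfectly balanced, 1 means one class only.
    """
    k = len(counts)
    if k < 2:
        # Assumption of this sketch: a single-class column scores 0.
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Class counts reported for 세외수입연계코드: 211099 x 20, 211001 x 1.
score = imbalance_score([20, 1])
print(f"{score:.1%}")
```

A 50/50 split would score 0 under this definition, so a value above 0.7 indicates that one class dominates the column.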

Reproduction

Analysis started: 2024-04-17 19:09:14.887199
Analysis finished: 2024-04-17 19:09:15.150706
Duration: 0.26 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

부서명
Text

Distinct: 17
Distinct (%): 81.0%
Missing: 0
Missing (%): 0.0%
Memory size: 300.0 B

Length

Max length: 11
Median length: 8
Mean length: 6.3809524
Min length: 4

Characters and Unicode

Total characters: 134
Distinct characters: 60
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
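As a small illustration of these character properties, the sketch below uses Python's standard `unicodedata` module to count characters by general category for two 부서명 values that appear in this report. `category_counts` is a hypothetical helper name. The two-letter codes map to the category names used in the tables: `Lo` is Other Letter, `Nd` is Decimal Number, and `Zs` is Space Separator.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# Two 부서명 values taken from the sample rows of this report.
samples = ["경주베이스볼파크2구장", "경주시 건천읍"]
for value in samples:
    print(value, dict(category_counts(value)))
```

The first value contributes the column's only decimal digit and the second its only space, which matches the single Decimal Number and Space Separator entries reported for this variable.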

Unique

Unique: 13
Unique (%): 61.9%

Sample

1st row: 청소년수련관
2nd row: 테니스장
3rd row: 외동읍민체육회관
4th row: 경주베이스볼파크2구장
5th row: 국제문화교류관

Value  Count  Frequency (%)
청소년수련관  2  9.1%
안강문화회관  2  9.1%
내남면체육공원  2  9.1%
황성공원  2  9.1%
테니스장  1  4.5%
국제문화교류관  1  4.5%
형산강체육공원  1  4.5%
경주베이스볼파크2구장  1  4.5%
불국체육센터  1  4.5%
서라벌문화회관  1  4.5%
Other values (8)  8  36.4%

Most occurring characters

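Value  Count  Frequency (%)
(not preserved)  7  5.2%
(not preserved)  7  5.2%
(not preserved)  7  5.2%
(not preserved)  7  5.2%
(not preserved)  7  5.2%
(not preserved)  5  3.7%
(not preserved)  5  3.7%
(not preserved)  4  3.0%
(not preserved)  4  3.0%
(not preserved)  4  3.0%
Other values (50)  77  57.5%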

Most occurring categories

Value  Count  Frequency (%)
Other Letter  132  98.5%
Decimal Number  1  0.7%
Space Separator  1  0.7%

Most frequent character per category

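Other Letter
Value  Count  Frequency (%)
(not preserved)  7  5.3%
(not preserved)  7  5.3%
(not preserved)  7  5.3%
(not preserved)  7  5.3%
(not preserved)  7  5.3%
(not preserved)  5  3.8%
(not preserved)  5  3.8%
(not preserved)  4  3.0%
(not preserved)  4  3.0%
(not preserved)  4  3.0%
Other values (48)  75  56.8%

Decimal Number
Value  Count  Frequency (%)
2  1  100.0%

Space Separator
Value  Count  Frequency (%)
(space)  1  100.0%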

Most occurring scripts

Value  Count  Frequency (%)
Hangul  132  98.5%
Common  2  1.5%

Most frequent character per script

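Hangul
Value  Count  Frequency (%)
(not preserved)  7  5.3%
(not preserved)  7  5.3%
(not preserved)  7  5.3%
(not preserved)  7  5.3%
(not preserved)  7  5.3%
(not preserved)  5  3.8%
(not preserved)  5  3.8%
(not preserved)  4  3.0%
(not preserved)  4  3.0%
(not preserved)  4  3.0%
Other values (48)  75  56.8%

Common
Value  Count  Frequency (%)
2  1  50.0%
(space)  1  50.0%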

Most occurring blocks

Value  Count  Frequency (%)
Hangul  132  98.5%
ASCII  2  1.5%

Most frequent character per block

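Hangul
Value  Count  Frequency (%)
(not preserved)  7  5.3%
(not preserved)  7  5.3%
(not preserved)  7  5.3%
(not preserved)  7  5.3%
(not preserved)  7  5.3%
(not preserved)  5  3.8%
(not preserved)  5  3.8%
(not preserved)  4  3.0%
(not preserved)  4  3.0%
(not preserved)  4  3.0%
Other values (48)  75  56.8%

ASCII
Value  Count  Frequency (%)
2  1  50.0%
(space)  1  50.0%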

부서코드
Real number (ℝ)

Distinct: 13
Distinct (%): 61.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5050236.6
Minimum: 5050056
Maximum: 5050344
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 321.0 B

Quantile statistics

Minimum: 5050056
5th percentile: 5050058
Q1: 5050062
Median: 5050309
Q3: 5050337
95th percentile: 5050337
Maximum: 5050344
Range: 288
Interquartile range (IQR): 275
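The quantile figures can be checked by hand. The sketch below rebuilds the 21 부서코드 observations from the value/count tables in this report and recomputes the quartiles, the 5th and 95th percentiles, the range, and the IQR with the standard library. It assumes inclusive linear interpolation between order statistics, which reproduces the reported numbers; whether the profiler uses exactly this method is an assumption.

```python
import statistics

# 부서코드 values and their counts, as listed in the report's value tables.
counts = {
    5050337: 5, 5050310: 3, 5050059: 2, 5050062: 2,
    5050344: 1, 5050280: 1, 5050309: 1, 5050322: 1, 5050272: 1,
    5050192: 1, 5050056: 1, 5050058: 1, 5050279: 1,
}
values = sorted(v for v, n in counts.items() for _ in range(n))

# Quartiles with inclusive linear interpolation.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")

# Cut points at 5% steps; the first and last are the 5th and 95th percentiles.
ventiles = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = ventiles[0], ventiles[-1]

value_range = max(values) - min(values)
iqr = q3 - q1

print(q1, median, q3, p5, p95, value_range, iqr)
```

Because Q3 and the 95th percentile coincide at 5050337, at least a fifth of the observations share that single code, which agrees with its count of 5 in the common-values table.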

Descriptive statistics

Standard deviation: 119.60873
Coefficient of variation (CV): 2.3683787 × 10⁻⁵
Kurtosis: -1.2802416
Mean: 5050236.6
Median Absolute Deviation (MAD): 29
Skewness: -0.80260168
Sum: 1.0605497 × 10⁸
Variance: 14306.248
Monotonicity: Not monotonic
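Several of these descriptive statistics follow directly from the value/count tables. The sketch below rebuilds the 21 부서코드 observations and recomputes the sum, mean, variance, standard deviation, coefficient of variation, and MAD using only the standard library. It assumes the sample (n - 1) variance and a MAD defined as the median of absolute deviations from the median, both of which match the reported figures; skewness and kurtosis are left out.

```python
import statistics

# 부서코드 values and their counts, as listed in the report's value tables.
counts = {
    5050337: 5, 5050310: 3, 5050059: 2, 5050062: 2,
    5050344: 1, 5050280: 1, 5050309: 1, 5050322: 1, 5050272: 1,
    5050192: 1, 5050056: 1, 5050058: 1, 5050279: 1,
}
values = [v for v, n in counts.items() for _ in range(n)]

total = sum(values)
mean = statistics.fmean(values)
variance = statistics.variance(values)  # sample variance, n - 1 denominator
std = statistics.stdev(values)          # square root of the sample variance
cv = std / mean                         # coefficient of variation

med = statistics.median(values)
mad = statistics.median([abs(v - med) for v in values])

print(total, round(mean, 1), round(variance, 3), round(std, 5), cv, mad)
```

The very small coefficient of variation reflects that these are identifier-like codes clustered within a range of 288 around a mean of about five million, so scale-based statistics say little about them.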
Histogram with fixed size bins (bins=13)

Common values
Value  Count  Frequency (%)
5050337  5  23.8%
5050310  3  14.3%
5050059  2  9.5%
5050062  2  9.5%
5050344  1  4.8%
5050280  1  4.8%
5050309  1  4.8%
5050322  1  4.8%
5050272  1  4.8%
5050192  1  4.8%
Other values (3)  3  14.3%

Minimum 10 values
Value  Count  Frequency (%)
5050056  1  4.8%
5050058  1  4.8%
5050059  2  9.5%
5050062  2  9.5%
5050192  1  4.8%
5050272  1  4.8%
5050279  1  4.8%
5050280  1  4.8%
5050309  1  4.8%
5050310  3  14.3%

Maximum 10 values
Value  Count  Frequency (%)
5050344  1  4.8%
5050337  5  23.8%
5050322  1  4.8%
5050310  3  14.3%
5050309  1  4.8%
5050280  1  4.8%
5050279  1  4.8%
5050272  1  4.8%
5050192  1  4.8%
5050062  2  9.5%

세외수입연계코드
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 9.5%
Missing: 0
Missing (%): 0.0%
Memory size: 300.0 B

Value  Count
211099  20
211001  1

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 1
Unique (%): 4.8%
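"Distinct" and "Unique" are easy to confuse. The reported figures are consistent with Distinct counting the different values present and Unique counting the values that occur exactly once. The sketch below illustrates that reading on the 세외수입연계코드 column, rebuilt from the counts in this report (211099 x 20, 211001 x 1); treat the definitions as an interpretation that matches the numbers rather than a confirmed specification.

```python
from collections import Counter

# 세외수입연계코드 column rebuilt from the reported counts.
column = ["211099"] * 20 + ["211001"]

freq = Counter(column)
n = len(column)

distinct = len(freq)                                  # different values present
unique = sum(1 for c in freq.values() if c == 1)      # values seen exactly once

print(f"Distinct: {distinct} ({distinct / n:.1%})")
print(f"Unique: {unique} ({unique / n:.1%})")
```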

Sample

1st row: 211099
2nd row: 211099
3rd row: 211099
4th row: 211099
5th row: 211099

Common Values

Value  Count  Frequency (%)
211099  20  95.2%
211001  1  4.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
211099  20  95.2%
211001  1  4.8%

Interactions


Correlations

Variable  부서명  부서코드  세외수입연계코드
부서명  1.000  0.687  0.000
부서코드  0.687  1.000  0.000
세외수입연계코드  0.000  0.000  1.000
Variable  부서코드  세외수입연계코드
부서코드  1.000  0.000
세외수입연계코드  0.000  1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

First rows

Index  부서명  부서코드  세외수입연계코드
0  청소년수련관  5050344  211099
1  테니스장  5050337  211099
2  외동읍민체육회관  5050059  211099
3  경주베이스볼파크2구장  5050337  211099
4  국제문화교류관  5050280  211099
5  형산강체육공원  5050337  211099
6  내남면체육공원  5050062  211099
7  불국체육센터  5050337  211099
8  안강문화회관  5050309  211099
9  서라벌문화회관  5050322  211099

Last rows

Index  부서명  부서코드  세외수입연계코드
11  안강문화회관  5050310  211099
12  알천축구장  5050272  211099
13  안강생활체육공원  5050310  211099
14  외동생활체육공원  5050059  211099
15  안강운동장  5050310  211099
16  청소년수련관  5050192  211099
17  내남면체육공원  5050062  211001
18  경주시 건천읍  5050058  211099
19  황성공원  5050279  211099
20  감포축구장  5050056  211099