minimalism

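Resetting and Reinstalling VSCODE

1. Save the existing settings: just in case, I captured the list of installed extensions and the contents of 'settings.json' (shown at the top of the post).
2. Delete the plugins: delete the C:\Users\<username>\.vscode folder.
3. Delete the personal settings files: likewise, delete the C:\Users\<username>\AppData\Roaming\Code folder. If the AppData folder is not visible, it is most likely set to 'hidden'.
4. Uninstall VSCODE from the Control Panel.
5. Reinstall VSCODE: click the "Install VSCODE" link above to download it.
Source: https://bit.ly/3Cjsf6J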
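The folders named in steps 2 and 3 can be listed before deleting anything by hand. This is only a minimal sketch, assuming a default per-user Windows install; `vscode_reset_targets` is a hypothetical helper name and `username` is a placeholder. The script only reports paths and removes nothing.

```python
from pathlib import PureWindowsPath


def vscode_reset_targets(user_profile: str) -> dict:
    """Return the paths touched by the reset steps (nothing is deleted here)."""
    home = PureWindowsPath(user_profile)
    appdata_code = home / "AppData" / "Roaming" / "Code"
    return {
        "extensions": home / ".vscode",                       # step 2
        "user_data": appdata_code,                            # step 3
        "settings": appdata_code / "User" / "settings.json",  # back this up in step 1
    }


# "username" is a placeholder, not a real account name
targets = vscode_reset_targets(r"C:\Users\username")
for name, path in targets.items():
    print(f"{name}: {path}")
```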

Computer Science/Programming 2022. 8. 22. 12:51

[ML] Polynomial Regression

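Polynomial Regression
$y = w_0 + w_1 x + w_2 x^2 + \cdots + w_d x^d$
A form of regression that raises the degree of the independent variable.
As a higher-degree regression equation, polynomial regression can overcome some of the limits of a simple linear model.
It is better suited for prediction when the underlying function is nonlinear and the data follows a curve.
The method adds powers of each feature (for example, its square) to the data and trains a linear regression model on the expanded, nonlinear features.
Because a quadratic function turns downward partway through, a cubic or higher function is usually used, or else a monotonically increasing function such as the square root or the logarithm.
Setting $X_d = x^d$, polynomial regression is ultimately a kind of multiple regression.
%matplotlib inline
import numpy as np
import pandas as pd..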
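The point that polynomial regression is multiple linear regression on the features $X_d = x^d$ can be sketched with NumPy alone. The data, the true weights, and the degree below are assumptions made up for illustration; they are not taken from the post.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic cubic data (assumed): y = 1 + 2x - 0.5x^2 + 0.3x^3 + noise
true_w = np.array([1.0, 2.0, -0.5, 0.3])
x = np.linspace(-3, 3, 200)
y = true_w[0] + true_w[1] * x + true_w[2] * x**2 + true_w[3] * x**3
y = y + rng.normal(scale=0.1, size=x.shape)

# Treat each power of x as its own feature: columns are [1, x, x^2, x^3]
degree = 3
X = np.vander(x, N=degree + 1, increasing=True)

# Ordinary least squares on the expanded features (a plain linear model)
w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print("estimated weights:", np.round(w_hat, 2))
```

The model stays linear in the weights; only the inputs are transformed, which is why a linear solver is enough.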

Artificial Intelligence/Machine Learning 2022. 8. 19. 11:20

[Kaggle] Titanic Survivor Prediction

Importing the libraries
    import numpy as np
    import pandas as pd
    from sklearn.preprocessing import LabelEncoder
    from sklearn.preprocessing import MinMaxScaler, StandardScaler
Loading the data
    train_set = pd.read_csv('/content/train.csv', index_col=0)
    test_set = pd.read_csv('/content/test.csv', index_col=0)
    train_set.head()
After loading the data, I checked that it had loaded correctly.
Checking the data info
    train_set.info()
Next, to find out which data needs preprocessing, I used .info() to ..

Artificial Intelligence/Kaggle-Dacon 2022. 8. 13. 15:23

[ML] XGBoost Concepts

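What is Boosting?
One of the ensemble techniques: it combines multiple weak decision trees. That is, it puts weight on the training errors of weak predictive models and passes them sequentially to the next model to build a strong predictive model.
What is XGBoost?
XGBoost stands for Extreme Gradient Boosting. Gradient Boost is the representative algorithm built on the boosting technique, and XGBoost is a library that implements this algorithm with support for parallel training. It supports both regression and classification problems, and it is popular because of its good performance and resource efficiency.
Advantages of XGBoost
[1] Faster execution time than GBM: parallel ..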
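The sequential idea described above, where each weak model learns from the errors the previous ones left behind, can be sketched in a few lines. This is a simplified gradient boosting loop with squared loss and one-split trees (stumps), written as an assumption for illustration; it is not how the XGBoost library is implemented, and the function names and data are hypothetical.

```python
import numpy as np


def fit_stump(x, residual):
    """Find the single threshold on x that best fits the residual (squared error)."""
    best = None
    for threshold in np.unique(x)[:-1]:  # skip the max so the right side is never empty
        left = residual[x <= threshold]
        right = residual[x > threshold]
        error = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or error < best[0]:
            best = (error, threshold, left.mean(), right.mean())
    _, threshold, left_value, right_value = best
    return threshold, left_value, right_value


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return np.where(x <= threshold, left_value, right_value)


def boost(x, y, n_rounds=50, learning_rate=0.1):
    """Add stumps one at a time, each trained on the current residuals."""
    base = y.mean()
    prediction = np.full_like(y, base, dtype=float)
    stumps, history = [], []
    for _ in range(n_rounds):
        residual = y - prediction            # errors left by the models so far
        stump = fit_stump(x, residual)
        prediction += learning_rate * predict_stump(stump, x)
        stumps.append(stump)
        history.append(float(np.mean((y - prediction) ** 2)))
    return base, stumps, history


# Synthetic data (assumed): a noisy sine curve
rng = np.random.default_rng(0)
x = np.linspace(0, 6, 100)
y = np.sin(x) + rng.normal(scale=0.1, size=x.shape)

base, stumps, history = boost(x, y)
print(f"MSE after 1 round: {history[0]:.4f}, after {len(history)} rounds: {history[-1]:.4f}")
```

Each stump alone is a poor model, but the training error shrinks as more of them are added in sequence.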

Artificial Intelligence/Machine Learning 2022. 8. 11. 16:59

[ML] Outlier Handling (Tukey Outlier)

import numpy as np
aaa = np.array([1,2,-10,4,5,6,7,8,50,10])
def outliers(data_out):
    quartile_1, q2, quartile_3 = np.percentile(data_out, [25,50,75])
    # value at the 25th percentile: Q1
    # value at the 50th percentile: Q2 (median)
    # value at the 75th percentile: Q3
    print("Q1 : ", quartile_1)
    print("q2 : ", q2)
    print("Q3 : ", quartile_3)
    iqr = quartile_3 - quartile_1
    print("iqr : ", iqr)
    # Why we subtract: Tukey's rule treats values more than 1.5 * IQR below Q1 (or above Q3) as outliers
    lower_bound = quartile_1 - (iqr * 1.5)
    upper_bound = ..
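Since the excerpt is cut off at the upper bound, here is a minimal self-contained sketch of the Tukey rule on the same array. The function name `tukey_outlier_indices` and the parameter `k` are hypothetical, and the upper bound `Q3 + k * IQR` is the standard Tukey fence rather than something visible in the truncated post.

```python
import numpy as np


def tukey_outlier_indices(data, k=1.5):
    """Return indices of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = np.percentile(data, [25, 75])
    iqr = q3 - q1
    lower_bound = q1 - k * iqr
    upper_bound = q3 + k * iqr
    return np.where((data < lower_bound) | (data > upper_bound))[0]


aaa = np.array([1, 2, -10, 4, 5, 6, 7, 8, 50, 10])
outlier_idx = tukey_outlier_indices(aaa)
print("outlier indices:", outlier_idx)        # [2 8]
print("outlier values :", aaa[outlier_idx])   # [-10  50]
```

For this array Q1 is 2.5 and Q3 is 7.75, so the fences are -5.375 and 15.625, which flags -10 and 50.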

Artificial Intelligence/Machine Learning 2022. 8. 10. 19:56

[ML] XGB plot_importance

import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
datasets = load_diabetes()
x = datasets.data
y = datasets.target
x_train, x_test, y_train, y_test = train_test_split(x, y, train_size=0.8, shuffle=True, random_state=123)
from xgboost import XGBRegressor
model = XGBRegressor()
model.fit(x_train, y_train)
print(model, ':', model.feature_..

Artificial Intelligence/Machine Learning 2022. 8. 10. 14:29

[ML] Data Preprocessing - Missing Values

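What are missing values?
Most machine learning algorithms do not work properly when there are missing features, that is, missing data. So the missing features have to be handled first.
Missing feature, NA (Not Available): called a 'missing value', a value that was not recorded.
Types of missing values
Random: missing values with no pattern
Non-random: missing values that follow a pattern
Strategies for handling missing values
Deletion
Imputation
Prediction model
Checking for missing values
Check whether values are missing: df["col"].isnull()
Count the missing values: df["col"].isnull().value_counts()
Deletion
When the missing values are 'random..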
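The checking, deletion, and imputation steps listed above can be sketched with pandas as follows. The DataFrame, its column names, and the choice of mean/mode imputation are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Toy data (assumed) with missing entries in both columns
df = pd.DataFrame({
    "age": [22.0, np.nan, 35.0, 41.0, np.nan],
    "city": ["Seoul", "Busan", None, "Seoul", "Seoul"],
})

# Check: count missing values per column
missing_counts = df.isnull().sum()

# Deletion: drop every row that has at least one missing value
dropped = df.dropna()

# Imputation: mean for the numeric column, most frequent value for the categorical one
imputed = df.copy()
imputed["age"] = imputed["age"].fillna(imputed["age"].mean())
imputed["city"] = imputed["city"].fillna(imputed["city"].mode()[0])

print(missing_counts)
print(dropped)
print(imputed)
```

Deletion keeps only two of the five rows here, which shows why imputation is often preferred when data is scarce.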

Artificial Intelligence/Machine Learning 2022. 8. 10. 14:28

Correlation Heatmap Visualization

import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
datasets = load_iris()
print(datasets.feature_names)
#['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']
x = datasets['data']
y = datasets['target']
df = pd.DataFrame(x, columns=['sepal length', 'sepal width', 'petal length', 'petal width'])
# Convert from NumPy to pandas so that column names can be used
# The column names are .. heatmap..
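How the column names are passed when building the DataFrame matters for the later heatmap labels: a list nested inside another list produces MultiIndex columns, while a flat list produces an ordinary index. The sketch below uses a small stand-in array, assumed for illustration, instead of the iris data, and stops at the correlation matrix that a heatmap would display.

```python
import numpy as np
import pandas as pd

# Small stand-in array (assumed) with four numeric features
x = np.array([
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [6.3, 3.3, 6.0, 2.5],
    [5.8, 2.7, 5.1, 1.9],
])
names = ['sepal length', 'sepal width', 'petal length', 'petal width']

nested = pd.DataFrame(x, columns=[names])  # list inside a list -> MultiIndex columns
flat = pd.DataFrame(x, columns=names)      # plain list -> ordinary column Index

print(type(nested.columns).__name__)
print(type(flat.columns).__name__)

# Pairwise Pearson correlations; this square matrix is what a heatmap would show
corr = flat.corr()
print(corr.round(2))
```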

Computer Science/Python 2022. 8. 9. 10:24

[Python] String startswith(): Checking Which String a String Starts With

You can use startswith() to check whether a string starts with a specific string.
1. Checking whether a string starts with a specific string using startswith()
2. Checking whether a word starts with a specific string using startswith() and split()
1. Checking whether a string starts with a specific string using startswith()
For example, you can check whether 'Hello world, Python!' starts with Hello as follows.
    text = 'Hello world, Python!'
    if text.startswith('Hello'):
        print('It starts with Hello')
    if not text.startswith('Python'):
        print('It does not start with Python')
Outpu..
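The second item in the outline, combining split() with startswith() to test individual words, is cut off in the excerpt. A minimal sketch of that idea, with variable names chosen here for illustration, might look like this:

```python
text = 'Hello world, Python!'

# str.startswith also accepts a tuple to test several prefixes at once
starts_with_greeting = text.startswith(('Hello', 'Hi'))

# Split on whitespace, then test each word separately
words = text.split()
words_starting_with_py = [word for word in words if word.startswith('Py')]

print(starts_with_greeting)      # True
print(words_starting_with_py)    # ['Python!']
```

Note that split() keeps punctuation attached to a word, so the match is 'Python!' rather than 'Python'.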

Computer Science/Python 2022. 8. 8. 18:09

[Python] How to Convert a Pandas DataFrame to a NumPy Array

You can convert a DataFrame to a NumPy array with .values or .to_numpy().
Example code
    import pandas as pd
    # Create a DataFrame
    data = [['Choi', 22], ['Kim', 48], ['Joo', 32]]
    df = pd.DataFrame(data, columns=['Name', 'Age'])
    # Convert to a NumPy array with .values or .to_numpy()
    print(df.values)
    print(df.to_numpy())
    #[['Choi' 22]
    # ['Kim' 48]
    # ['Joo' 32]]
https://wooono.tistory.com/175

Computer Science/Python 2022. 8. 8. 15:41
