[Day13] LLM in One Book Online Study, Cohort 1 - Efficient Parameter Tuning (Quantization & QLoRA)


31weeks 2025. 1. 26. 18:40


4.2 QLoRA Theory and Practice

4.2.1 Understanding Quantization

The concept of floating point
How data types relate to precision

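- FP32: One of the standard ways to represent real numbers. Also called single precision, it uses 32 bits (4 bytes) and covers a very wide range of values. Like any floating-point format, it packs representable values most densely around 0, and it offers high precision. Its memory footprint is comparatively large, which can become a limiting factor with large models or datasets.

- FP16: Also called half precision; it uses 16 bits. Its precision is lower and its representable range narrower than FP32, but it uses less memory and computes more efficiently. The same memory holds twice as many values and arithmetic is faster on hardware that supports it, so it is widely used for training and inference of large machine learning models. Where a small loss of precision is acceptable, it can substantially speed up training and reduce memory use.

- BF16: A variant of the 16-bit floating-point format. It keeps the same 8 exponent bits as FP32, so it covers essentially the same range of values. Its precision is lower than FP32 and also lower than FP16 (7 mantissa bits versus 23 and 10). In deep learning training it is a good way to cut memory use while keeping numerical stability, because the wide range makes overflow and underflow much less likely than with FP16. It balances computational efficiency against model quality well, so it is often used to train large language models.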
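
The range and precision differences in the list above can be checked with a small standard-library sketch. The helper names (`to_fp16`, `to_bf16`, `fits_in_fp16`) are hypothetical, and BF16 is emulated here by simply truncating the FP32 bit pattern to its top 16 bits, which is a simplification (real hardware usually rounds instead of truncating).

```python
import struct

def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision (FP16) value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def to_bf16(x: float) -> float:
    """Emulate BF16 by keeping only the top 16 bits of the FP32 encoding.

    Assumption: truncation instead of round-to-nearest, for simplicity.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

def fits_in_fp16(x: float) -> bool:
    """Return False when x is too large to be stored as FP16."""
    try:
        struct.pack("<e", x)
        return True
    except OverflowError:
        return False

# Precision: FP16 keeps 10 mantissa bits, BF16 only 7.
print(to_fp16(1.001))         # 1.0009765625
print(to_bf16(1.001))         # 1.0

# Range: FP16 tops out at 65504, BF16 shares the FP32 exponent range.
print(fits_in_fp16(70000.0))  # False
print(to_bf16(70000.0))       # 69632.0
```

FP16 stays closer to 1.001 but cannot hold 70000 at all, while BF16 holds 70000 (coarsely) and loses the small fraction near 1.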

https://en.wikipedia.org/wiki/Single-precision_floating-point_format

Single-precision floating-point format - Wikipedia


https://en.wikipedia.org/wiki/Half-precision_floating-point_format

Half-precision floating-point format - Wikipedia


https://en.wikipedia.org/wiki/Bfloat16_floating-point_format

bfloat16 floating-point format - Wikipedia


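Difference between BF16 and FP16
- Converting FP32 to FP16 narrows the range of representable values, whereas converting to BF16 keeps the range as it is (at the cost of some ability to represent numbers precisely).
- Quantization is a technique that adjusts how the numbers a computer handles are represented, typically by mapping high-precision values onto a format with fewer bits.
Ways to reduce the numeric range
- Symmetric quantization: shrinks the range equally on both sides around 0, so 0.0 always maps to the integer 0.
- Asymmetric quantization: fits the range to the data distribution, so it can lean toward one side; a zero-point offset records where 0.0 lands.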
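
The two schemes above can be compared in a minimal pure-Python sketch. The 8-bit width, the function names, and the sample `weights` list are all assumptions chosen for illustration; the post does not specify them.

```python
def symmetric_quantize(values, num_bits=8):
    """Map floats to signed integers over the zero-centered range [-max|x|, +max|x|]."""
    qmax = 2 ** (num_bits - 1) - 1              # 127 for 8 bits
    scale = max(abs(v) for v in values) / qmax
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale

def asymmetric_quantize(values, num_bits=8):
    """Map floats to unsigned integers over the observed range [min(x), max(x)]."""
    qmax = 2 ** num_bits - 1                    # 255 for 8 bits
    lo, hi = min(values), max(values)
    scale = (hi - lo) / qmax
    zero_point = round(-lo / scale)             # integer that represents 0.0
    q = [max(0, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

# Hypothetical weights, skewed toward positive values.
weights = [-0.5, 0.0, 0.25, 1.5]

sym_q, sym_scale = symmetric_quantize(weights)
asym_q, asym_scale, asym_zp = asymmetric_quantize(weights)

print(sym_q, sym_scale)             # symmetric codes leave most negative integers unused
print(asym_q, asym_scale, asym_zp)  # asymmetric codes span the full 0..255 grid
```

For skewed data like this, the asymmetric scale is smaller than the symmetric one, meaning the same number of bits resolves finer steps. The sketch does not handle empty or all-equal inputs, which would make the scale zero.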
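Quantization error
- The unavoidable loss of information that occurs when real values are converted to integers, because each value is rounded to the nearest representable level.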
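
As a rough illustration of this error, the sketch below quantizes a few values symmetrically, converts them back, and measures the largest difference. The function name `quantize_roundtrip`, the sample values, and the choice of 8 and 4 bits are assumptions made for the example.

```python
def quantize_roundtrip(values, num_bits):
    """Symmetric quantize, then dequantize. Returns the restored floats and the scale."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = max(abs(v) for v in values) / qmax
    restored = [round(v / scale) * scale for v in values]
    return restored, scale

# Hypothetical sample values.
values = [0.1234, -0.9876, 0.5, 0.3333]

results = {}  # num_bits -> (max absolute error, scale)
for bits in (8, 4):
    restored, scale = quantize_roundtrip(values, bits)
    max_err = max(abs(a - b) for a, b in zip(values, restored))
    results[bits] = (max_err, scale)
    print(f"{bits}-bit: scale={scale:.5f}, max error={max_err:.5f}")
```

Because each value is rounded to the nearest step, the error per value is at most half the scale, and the scale grows as the bit width shrinks. That is why 4-bit quantization loses more information than 8-bit.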

4.2.2 RunPod Environment Setup

H100 PCIe x1
PyTorch 2.2.0
Container Disk 400GB
Volume Disk 400GB

4.2.3 Preparing the Dataset

https://huggingface.co/datasets/daje/kotext-to-sql-v1

daje/kotext-to-sql-v1 · Datasets at Hugging Face

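
A minimal sketch of pulling this dataset with the Hugging Face `datasets` library, assuming the library is installed and the machine has network access. The post does not list the split or column names, so the example only prints the returned object to inspect them.

```python
from datasets import load_dataset

# Dataset ID taken from the link above.
dataset = load_dataset("daje/kotext-to-sql-v1")

# Inspect the available splits, column names, and row counts before using them.
print(dataset)
```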

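4.2.4 Setting Quantization Parameters

Use BitsAndBytesConfig to configure quantization → minimizes the VRAM the model uses
Key settings
- load_in_4bit=True: loads the model weights in 4-bit precision → greatly reduces memory use
- bnb_4bit_use_double_quant=True: enables double quantization, which also quantizes the quantization constants themselves → saves a little more memory on top of 4-bit loading
- bnb_4bit_quant_type="nf4": selects the NF4 scheme → 4-bit NormalFloat, a 4-bit data type designed for weights that follow a roughly normal distribution
- bnb_4bit_compute_dtype=torch.bfloat16: runs computation in bfloat16 → a good fit because Llama models were trained in 16-bit floating point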
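
Put together, the four settings above might look like the following sketch. It assumes `torch`, `transformers`, and `bitsandbytes` are installed; the variable name `bnb_config` is just a label chosen here.

```python
import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store weights in 4-bit precision
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
    bnb_4bit_quant_type="nf4",              # 4-bit NormalFloat data type
    bnb_4bit_compute_dtype=torch.bfloat16,  # dequantize to bfloat16 for computation
)
```

As rough arithmetic, 8 billion parameters at 4 bits each come to about 4 GB of weights, versus about 16 GB at 16 bits, before quantization constants and activations are counted.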

4.2.5 Preparing the Model

https://huggingface.co/allganize/Llama-3-Alpha-Ko-8B-Instruct

allganize/Llama-3-Alpha-Ko-8B-Instruct · Hugging Face

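
One way to load this model in 4-bit form is sketched below. It is not necessarily the exact code used in the study. It assumes a CUDA GPU with `transformers`, `bitsandbytes`, and `accelerate` installed; `device_map="auto"` and the variable names are choices made for this example.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Model ID taken from the link above.
model_id = "allganize/Llama-3-Alpha-Ko-8B-Instruct"

# Quantization settings from section 4.2.4.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,  # weights are quantized while loading
    device_map="auto",               # assumption: let accelerate place layers on the GPU
)

# Report how much memory the loaded weights occupy, in GB.
print(model.get_memory_footprint() / 1e9)
```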


License: CC BY-NC-ND (Attribution, NonCommercial, NoDerivatives)
