
[PyTorch] Converting NumPy arrays to tensors (copying vs. sharing)

February 10, 2022 · 1 min read

This post covers the basic properties of a Tensor, the Tensor types that PyTorch defines, and the three functions for converting a NumPy array to a tensor in PyTorch, from_numpy(), as_tensor(), and tensor(), along with how they differ.

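Scalar, Vector, Matrix, Tensor

As the figure below illustrates:

A single value is a Scalar
1D (one-dimensional) data is a Vector
2D (row-and-column) data is a Matrix
Data with three or more dimensions is a Tensor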

from IPython.display import Image

Image(url='https://mblogthumb-phinf.pstatic.net/MjAyMDA1MjVfMTM0/MDAxNTkwMzc4MTY4MDQy.iOzxIfhew8Bsto7uqNW3QYj-k9bysF775jXYLECD6bwg.uMJ87NPURvklkXF2TXFnygaSnc32erHm_mXbnKgvO24g.PNG.nabilera1/image.png?type=w800')
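The same four cases can be checked directly in PyTorch. The snippet below is a minimal sketch; the variable names are made up for illustration, and the point is only that `ndim` reports 0, 1, 2, and 3 respectively (PyTorch stores all four as `torch.Tensor`).

```python
import torch

# Hypothetical sample values, one per dimensionality
scalar = torch.tensor(3.14)               # 0-D: a single value
vector = torch.tensor([1, 2, 3])          # 1-D
matrix = torch.tensor([[1, 2], [3, 4]])   # 2-D
tensor3d = torch.zeros(2, 3, 4)           # 3-D

for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("tensor3d", tensor3d)]:
    # ndim is the number of dimensions, shape is the size along each one
    print(name, t.ndim, tuple(t.shape))
```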

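Properties of a Tensor

A torch.Tensor has a single data type.
Some operations between torch.Tensor objects, such as matrix multiplication, require the operands to have the same data type; elementwise arithmetic promotes mixed data types to a common one.
Work that can be done with NumPy array operations can also be done on a torch.Tensor, and converting to a torch.Tensor lets you use a GPU to speed up training.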
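A short sketch of these properties follows. It assumes a reasonably recent PyTorch release, where elementwise arithmetic promotes mixed dtypes while matrix multiplication expects matching ones; the variable names are illustrative, and the GPU step falls back to the CPU when CUDA is unavailable.

```python
import torch

a = torch.tensor([1, 2, 3])          # integer input -> torch.int64
b = torch.tensor([0.5, 1.5, 2.5])    # float input -> default float dtype (torch.float32)

# A tensor holds one dtype: mixing an int and a float yields a float tensor
mixed = torch.tensor([1, 2.5])
print(mixed.dtype)

# Elementwise arithmetic promotes int64 + float32 to float32
total = a + b
print(total.dtype)

# Matrix multiplication expects matching dtypes, so cast explicitly first
m_int = torch.ones(2, 2, dtype=torch.int64)
m_float = torch.ones(2, 2, dtype=torch.float32)
try:
    torch.mm(m_int, m_float)
    mm_failed = False
except RuntimeError:
    mm_failed = True
print("mixed-dtype mm raised an error:", mm_failed)
product = torch.mm(m_int.to(torch.float32), m_float)

# Move a tensor to the GPU if one is available, otherwise stay on the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
on_device = b.to(device)
print(on_device.device)
```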

Import the required modules

# import modules
import torch
import numpy as np

Check the PyTorch version

print(torch.__version__)

1.10.2+cu113

Converting a NumPy array to a Tensor

Method 1. torch.from_numpy()
Method 2. torch.as_tensor()
Method 3. torch.tensor()

# create sample data
arr = np.array([1, 3, 5, 7, 9])
print(arr)
print(arr.dtype)
print(type(arr))

[1 3 5 7 9]
int64
<class 'numpy.ndarray'>

For a NumPy array, torch.from_numpy() behaves the same as torch.as_tensor(): both share memory with the array.

`torch.from_numpy()` - sharing

t1 = torch.from_numpy(arr)
print(t1)        # print the tensor
print(t1.dtype)  # dtype is the data type of the elements
print(t1.type()) # type() is the tensor type
print(type(t1))  # Python type of the variable t1

tensor([1, 3, 5, 7, 9])
torch.int64
torch.LongTensor
<class 'torch.Tensor'>

`torch.as_tensor()` - sharing

t2 = torch.as_tensor(arr)
print(t2)        # print the tensor
print(t2.dtype)  # dtype is the data type of the elements
print(t2.type()) # type() is the tensor type
print(type(t2))  # Python type of the variable t2

tensor([1, 3, 5, 7, 9])
torch.int64
torch.LongTensor
<class 'torch.Tensor'>

# set index 0 of the numpy array to 999
arr[0] = 999

# print t1 and t2
print(f't1: {t1}')
print(f't2: {t2}')

t1: tensor([999,   3,   5,   7,   9])
t2: tensor([999,   3,   5,   7,   9])

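After an element of the NumPy array is modified, the corresponding element changes in the tensors created from that array with torch.from_numpy() and torch.as_tensor().

This happens because both torch.from_numpy() and torch.as_tensor() share memory with the source array instead of copying it.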
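The sharing can also be checked explicitly. The sketch below uses `np.shares_memory` on the tensor's `.numpy()` view, and shows two details not covered above: writes through the tensor are visible in the array as well, and `torch.as_tensor()` falls back to a copy when a different `dtype` is requested. Variable names are illustrative.

```python
import numpy as np
import torch

arr = np.array([1, 3, 5, 7, 9])
t_from = torch.from_numpy(arr)
t_as = torch.as_tensor(arr)

# Both tensors are views over the array's buffer
shared_from = np.shares_memory(arr, t_from.numpy())
shared_as = np.shares_memory(arr, t_as.numpy())
print(shared_from, shared_as)

# Sharing works in both directions: writing through the tensor changes the array
t_from[1] = -1
print(arr)

# Requesting a different dtype forces as_tensor() to copy
t_float = torch.as_tensor(arr, dtype=torch.float32)
shared_float = np.shares_memory(arr, t_float.numpy())
arr[0] = 999
print(shared_float, t_float[0].item(), t_as[0].item())
```

If a function must not be affected by later changes to the caller's array, sharing is the wrong choice and an explicit copy is safer.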

`torch.tensor()` - copying

# reset the sample data
arr = np.array([1, 3, 5, 7, 9])
print(arr)
print(arr.dtype)
print(type(arr))

[1 3 5 7 9]
int64
<class 'numpy.ndarray'>

t3 = torch.tensor(arr)
print(t3)        # print the tensor
print(t3.dtype)  # dtype is the data type of the elements
print(t3.type()) # type() is the tensor type
print(type(t3))  # Python type of the variable t3

tensor([1, 3, 5, 7, 9])
torch.int64
torch.LongTensor
<class 'torch.Tensor'>

# set index 0 of the numpy array to 999
arr[0] = 999

# print t3
print(f't3: {t3}')

t3: tensor([1, 3, 5, 7, 9])

Because torch.tensor() copies the NumPy array instead of sharing it, changing an element of the original array has no effect on the tensor.
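As a small sketch of the copying behavior, the snippet below compares `torch.tensor(arr)` with an alternative that is not part of the walkthrough above: sharing first via `torch.from_numpy()` and then copying explicitly with `.clone()`. Both end up independent of the array; the variable names are illustrative.

```python
import numpy as np
import torch

arr = np.array([1, 3, 5, 7, 9])

copied = torch.tensor(arr)               # copies the data
cloned = torch.from_numpy(arr).clone()   # shares first, then makes an explicit copy

# Neither tensor shares a buffer with the array
copied_shares = np.shares_memory(arr, copied.numpy())
cloned_shares = np.shares_memory(arr, cloned.numpy())
print(copied_shares, cloned_shares)

# Changing the array leaves both tensors untouched
arr[0] = 999
print(copied[0].item(), cloned[0].item())
```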

Tensor data types

Link to the PyTorch documentation

TYPE	NAME	EQUIVALENT	TENSOR TYPE
32-bit integer (signed)	torch.int32	torch.int	IntTensor
64-bit integer (signed)	torch.int64	torch.long	LongTensor
16-bit integer (signed)	torch.int16	torch.short	ShortTensor
32-bit floating point	torch.float32	torch.float	FloatTensor
64-bit floating point	torch.float64	torch.double	DoubleTensor
16-bit floating point	torch.float16	torch.half	HalfTensor
8-bit integer (signed)	torch.int8		CharTensor
8-bit integer (unsigned)	torch.uint8		ByteTensor
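The table can be cross-checked from code. The sketch below confirms that the EQUIVALENT names are aliases of the NAME column, and builds a small mapping from NumPy dtypes to the tensor dtype and CPU tensor type that `torch.from_numpy()` produces. The `mapping` dictionary is a hypothetical helper for this illustration only.

```python
import numpy as np
import torch

# The EQUIVALENT column lists aliases of the NAME column
print(torch.int == torch.int32, torch.long == torch.int64, torch.short == torch.int16)
print(torch.float == torch.float32, torch.double == torch.float64, torch.half == torch.float16)

# Map each NumPy dtype to the (dtype, tensor type) of the converted tensor
np_dtypes = [np.int8, np.uint8, np.int16, np.int32, np.int64,
             np.float16, np.float32, np.float64]
mapping = {}
for np_dtype in np_dtypes:
    t = torch.from_numpy(np.zeros(2, dtype=np_dtype))
    mapping[np.dtype(np_dtype).name] = (t.dtype, t.type())

for name, (dtype, tensor_type) in mapping.items():
    print(name, "->", dtype, tensor_type)

# Casting changes the dtype and returns a new tensor
as_float = torch.from_numpy(np.array([1, 3, 5], dtype=np.int64)).to(torch.float32)
print(as_float.dtype)
```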
