Updating a pandas DataFrame while iterating row by row


big-blog 2020. 6. 6. 07:59


I have a pandas DataFrame that looks like this (it is fairly large):

           date      exer exp     ifor         mat  
1092  2014-03-17  American   M  528.205  2014-04-19 
1093  2014-03-17  American   M  528.205  2014-04-19 
1094  2014-03-17  American   M  528.205  2014-04-19 
1095  2014-03-17  American   M  528.205  2014-04-19    
1096  2014-03-17  American   M  528.205  2014-05-17

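Now I want to iterate over it row by row. As I go through each row, the value of ifor in that row may change depending on some conditions, and I need to look up another DataFrame.

How do I update the DataFrame while iterating? I tried a few things, and none of them worked.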

for i, row in df.iterrows():
    if <something>:
        row['ifor'] = x
    else:
        row['ifor'] = y

    df.ix[i]['ifor'] = x

Neither of these approaches seems to work. I don't see the updated values in the DataFrame.

You can assign values inside the loop using df.set_value:

for i, row in df.iterrows():
  ifor_val = something
  if <condition>:
    ifor_val = something_else
  df.set_value(i,'ifor',ifor_val)

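If you don't need the row values, you can simply iterate over the index of df, but I kept the original for loop in case you need the row values for something not shown here.

Update

df.set_value() has been deprecated since version 0.21.0. You can use the df.at indexer instead: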

for i, row in df.iterrows():
    ifor_val = something
    if <condition>:
        ifor_val = something_else
    df.at[i, 'ifor'] = ifor_val

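A pandas DataFrame object should be thought of as a Series of Series. In other words, you should think of it in terms of columns. This matters because when you use pd.DataFrame.iterrows you are iterating over the rows as Series. These are not the Series that the DataFrame stores, however; they are new Series created for you while you iterate. That means that when you assign to them, those edits are not reflected in the original DataFrame.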
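The point about iterrows can be checked with a small experiment. This is a minimal sketch using a hypothetical two-row frame with mixed dtypes (the column names are borrowed from the question, the values are illustrative); it assigns to each row inside the loop and then inspects the DataFrame.

```python
import pandas as pd

# Hypothetical frame with mixed dtypes (strings and floats), so each row
# handed out by iterrows is a freshly built Series.
df = pd.DataFrame({"exer": ["American", "American"], "ifor": [528.205, 528.205]})

for i, row in df.iterrows():
    row["ifor"] = 0.0  # changes only the temporary Series for this row

unchanged = df["ifor"].tolist()
print(unchanged)  # the DataFrame still holds the original values
```

The assignment succeeds without an error, which is what makes this easy to miss: the write lands on the temporary row object, not on df.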

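OK, now that that is out of the way: what do we do?

Suggestions made before this post include:

pd.DataFrame.set_value, which is deprecated as of pandas version 0.21
pd.DataFrame.ix, which is deprecated
pd.DataFrame.loc, which is fine, but it can work on array indexers and you can do better

My recommendation
Use pd.DataFrame.at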

for i in df.index:
    if <something>:
        df.at[i, 'ifor'] = x
    else:
        df.at[i, 'ifor'] = y

You can even change this to:

for i in df.index:
    df.at[i, 'ifor'] = x if <something> else y
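For a runnable version of the loop above, here is a minimal sketch with the placeholders filled in. The condition (exp equal to 'M') and the values of x and y are assumptions chosen only for illustration; they are not part of the original question.

```python
import pandas as pd

# Small hypothetical frame using the question's column names and index labels.
df = pd.DataFrame(
    {"exp": ["M", "Q", "M"], "ifor": [528.205, 528.205, 528.205]},
    index=[1092, 1093, 1094],
)
x, y = 1.0, 2.0  # hypothetical replacement values

for i in df.index:
    # df.at reads and writes a single cell by row label and column label
    df.at[i, "ifor"] = x if df.at[i, "exp"] == "M" else y

print(df)
```

Because df.at addresses the cell through the DataFrame itself, the write is applied to df rather than to a temporary row object.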

Response to comment

What if I need to use the previous row's value for the if condition?

for i in range(1, len(df) + 1):
    j = df.columns.get_loc('ifor')
    if <something>:
        df.iat[i - 1, j] = x
    else:
        df.iat[i - 1, j] = y
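As a concrete illustration of reading the previous row by position, the following is a minimal sketch. The rule it applies (a value may not fall below the previous row's value) and the data are assumptions, and the indexing differs slightly from the loop above: here i is the current position and i - 1 the previous one.

```python
import pandas as pd

df = pd.DataFrame({"ifor": [1.0, 5.0, 2.0, 7.0]})  # hypothetical values
j = df.columns.get_loc("ifor")  # integer position of the column

# Start at 1 so that position i - 1 always exists.
for i in range(1, len(df)):
    prev = df.iat[i - 1, j]
    # Hypothetical rule: never let a value drop below the previous row's value.
    if df.iat[i, j] < prev:
        df.iat[i, j] = prev

print(df["ifor"].tolist())
```

Since each step reads a value that an earlier step may already have updated, changes carry forward through the loop.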

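One method you can use is itertuples(). It iterates over the DataFrame rows as namedtuples, with the index value as the first element of each tuple, and it is much faster than iterrows(). With itertuples(), each row carries its Index in the DataFrame, and you can use loc to set the value.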

for row in df.itertuples():
    if <something>:
        df.at[row.Index, 'ifor'] = x
    else:
        df.at[row.Index, 'ifor'] = y

    # Equivalent assignment with loc (slower than at for a single cell):
    # df.loc[row.Index, 'ifor'] = x

Using .at is much faster than loc. Thanks, @SantiStSupery.
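The question also mentions looking values up in another DataFrame. A minimal sketch of combining itertuples() with such a lookup might look like the following; the rates table, its contents, and the choice of exp as the lookup key are all hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    {"exer": ["American", "European"], "exp": ["M", "Q"], "ifor": [528.205, 528.205]},
    index=[1092, 1093],
)
# Hypothetical lookup table keyed by the 'exp' code.
rates = pd.DataFrame({"rate": [1.5, 2.5]}, index=["M", "Q"])

for row in df.itertuples():
    # row.Index is the row label; the other columns are namedtuple attributes
    df.at[row.Index, "ifor"] = rates.at[row.exp, "rate"]

print(df)
```

The namedtuple is used only for reading; the write still goes through df.at so that it reaches the DataFrame.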

You should assign the value with df.ix[i, 'exp'] = X or df.loc[i, 'exp'] = X instead of df.ix[i]['ifor'] = x.

Otherwise you are working on a view, and you should get a warning:

-c:1: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_index,col_indexer] = value instead

But certainly, the loop should probably be replaced by a vectorized algorithm to make full use of the DataFrame, as @Phillip Cloud suggested.
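When the condition can be written as an expression over whole columns, the loop can often be dropped altogether. The following is a minimal sketch of one vectorized form using numpy.where; the condition and the values x and y are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"exp": ["M", "Q", "M"], "ifor": [528.205, 528.205, 528.205]})
x, y = 1.0, 2.0  # hypothetical replacement values

# Evaluate the condition for every row at once and pick x or y accordingly.
df["ifor"] = np.where(df["exp"] == "M", x, y)

print(df)
```

This only applies when each row's new value does not depend on values written earlier in the same pass; otherwise an explicit loop may still be needed.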

Well, if you are going to iterate anyway, why not use the simplest method of all, df['Column'].values[i]:

df['Column'] = ''

for i in range(len(df)):
    df['Column'].values[i] = something/update/new_value

Or, if you want to compare the new values with the old ones or anything like that, why not store them in a list and then assign the list to the column at the end:

mylist, df['Column'] = [], ''

for <condition>:
    mylist.append(something/update/new_value)

df['Column'] = mylist

for i, row in df.iterrows():
    if <something>:
        df.at[i, 'ifor'] = x
    else:
        df.at[i, 'ifor'] = y

Increment the MAX number from a column. For example:

df1 = [sort_ID, Column1,Column2]
print(df1)

My output :

Sort_ID Column1 Column2
12         a    e
45         b    f
65         c    g
78         d    h

MAX = df1['Sort_ID'].max() #This returns my Max Number

Now I need to create a column in df2 and fill it with values that increment from MAX:

Sort_ID Column1 Column2
79      a1       e1
80      b1       f1
81      c1       g1
82      d1       h1

Note: df2 will initially contain only Column1 and Column2. The Sort_ID column needs to be created, with values incrementing from the MAX of df1.
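One way to fill such a column without iterating is sketched below. The frames are rebuilt from the sample output shown above, and the name max_id is introduced here only for illustration.

```python
import pandas as pd

# Frames rebuilt from the sample output above.
df1 = pd.DataFrame(
    {"Sort_ID": [12, 45, 65, 78], "Column1": list("abcd"), "Column2": list("efgh")}
)
df2 = pd.DataFrame(
    {"Column1": ["a1", "b1", "c1", "d1"], "Column2": ["e1", "f1", "g1", "h1"]}
)

max_id = int(df1["Sort_ID"].max())  # 78 for this sample data

# Insert Sort_ID as the first column, counting up from max_id + 1.
df2.insert(0, "Sort_ID", list(range(max_id + 1, max_id + 1 + len(df2))))

print(df2)
```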

Reference URL: https://stackoverflow.com/questions/23330654/update-a-dataframe-in-pandas-while-iterating-row-by-row
