CamelCase를 snake_case로 변환하는 우아한 Python 함수?

development

CamelCase를 snake_case로 변환하는 우아한 Python 함수?

big-blog 2020. 3. 4. 07:52

CamelCase를 snake_case로 변환하는 우아한 Python 함수?

예:

>>> convert('CamelCase')
'camel_case'

이것은 매우 철저합니다.

def convert(name):
    s1 = re.sub('(.)([A-Z][a-z]+)', r'\1_\2', name)
    return re.sub('([a-z0-9])([A-Z])', r'\1_\2', s1).lower()

이 모든 것들과 함께 작동하며 이미 낙타되지 않은 버전에는 영향을 미치지 않습니다.

>>> convert('CamelCase')
'camel_case'
>>> convert('CamelCamelCase')
'camel_camel_case'
>>> convert('Camel2Camel2Case')
'camel2_camel2_case'
>>> convert('getHTTPResponseCode')
'get_http_response_code'
>>> convert('get2HTTPResponseCode')
'get2_http_response_code'
>>> convert('HTTPResponseCode')
'http_response_code'
>>> convert('HTTPResponseCodeXYZ')
'http_response_code_xyz'

또는 그것을 거대한 시간이라고 부르려면 정규 표현식을 미리 컴파일 할 수 있습니다.

first_cap_re = re.compile('(.)([A-Z][a-z]+)')
all_cap_re = re.compile('([a-z0-9])([A-Z])')
def convert(name):
    s1 = first_cap_re.sub(r'\1_\2', name)
    return all_cap_re.sub(r'\1_\2', s1).lower()

정규식 모듈을 가져 오는 것을 잊지 마십시오

import re

패키지 인덱스에는 이러한 것들을 처리 할 수 있는 변곡 라이브러리 가 있습니다. 이 경우 다음을 찾고 있습니다 inflection.underscore().

>>> inflection.underscore('CamelCase')
'camel_case'

왜 이것들이 그렇게 복잡한 지 모르겠습니다.

대부분의 경우 간단한 표현으로 ([A-Z]+)트릭을 수행합니다.

>>> re.sub('([A-Z]+)', r'_\1','CamelCase').lower()
'_camel_case'  
>>> re.sub('([A-Z]+)', r'_\1','camelCase').lower()
'camel_case'
>>> re.sub('([A-Z]+)', r'_\1','camel2Case2').lower()
'camel2_case2'
>>> re.sub('([A-Z]+)', r'_\1','camelCamelCase').lower()
'camel_camel_case'
>>> re.sub('([A-Z]+)', r'_\1','getHTTPResponseCode').lower()
'get_httpresponse_code'

첫 번째 문자를 무시하려면 뒤에 외모를 추가하십시오. (?!^)

>>> re.sub('(?!^)([A-Z]+)', r'_\1','CamelCase').lower()
'camel_case'
>>> re.sub('(?!^)([A-Z]+)', r'_\1','CamelCamelCase').lower()
'camel_camel_case'
>>> re.sub('(?!^)([A-Z]+)', r'_\1','Camel2Camel2Case').lower()
'camel2_camel2_case'
>>> re.sub('(?!^)([A-Z]+)', r'_\1','getHTTPResponseCode').lower()
'get_httpresponse_code'

ALLCap을 all_caps로 분리하고 문자열에서 숫자를 기대하려면 여전히 두 개의 별도 런을 수행 할 필요가 없습니다. |이 표현식 ((?<=[a-z0-9])[A-Z]|(?!^)[A-Z](?=[a-z]))은 책의 모든 시나리오를 처리 할 수 있습니다.

>>> a = re.compile('((?<=[a-z0-9])[A-Z]|(?!^)[A-Z](?=[a-z]))')
>>> a.sub(r'_\1', 'getHTTPResponseCode').lower()
'get_http_response_code'
>>> a.sub(r'_\1', 'get2HTTPResponseCode').lower()
'get2_http_response_code'
>>> a.sub(r'_\1', 'get2HTTPResponse123Code').lower()
'get2_http_response123_code'
>>> a.sub(r'_\1', 'HTTPResponseCode').lower()
'http_response_code'
>>> a.sub(r'_\1', 'HTTPResponseCodeXYZ').lower()
'http_response_code_xyz'

그것은 모두 당신이 원하는 것에 달려 있으므로 지나치게 복잡해서는 안되므로 귀하의 요구에 가장 적합한 솔루션을 사용하십시오.

조이!

개인적으로 파이썬에서 정규 표현식을 사용하는 것이 어떻게 우아하게 묘사 될 수 있는지 잘 모르겠습니다. 여기서 대부분의 답변은 "코드 골프"유형 RE 트릭을 수행하는 것입니다. 우아한 코딩은 쉽게 이해할 수 있어야합니다.

def to_snake_case(not_snake_case):
    final = ''
    for i in xrange(len(not_snake_case)):
        item = not_snake_case[i]
        if i < len(not_snake_case) - 1:
            next_char_will_be_underscored = (
                not_snake_case[i+1] == "_" or
                not_snake_case[i+1] == " " or
                not_snake_case[i+1].isupper()
            )
        if (item == " " or item == "_") and next_char_will_be_underscored:
            continue
        elif (item == " " or item == "_"):
            final += "_"
        elif item.isupper():
            final += "_"+item.lower()
        else:
            final += item
    if final[0] == "_":
        final = final[1:]
    return final

>>> to_snake_case("RegularExpressionsAreFunky")
'regular_expressions_are_funky'

>>> to_snake_case("RegularExpressionsAre Funky")
'regular_expressions_are_funky'

>>> to_snake_case("RegularExpressionsAre_Funky")
'regular_expressions_are_funky'

stringcase 는 이것에 대한 나의 도서관입니다. 예 :

>>> from stringcase import pascalcase, snakecase
>>> snakecase('FooBarBaz')
'foo_bar_baz'
>>> pascalcase('foo_bar_baz')
'FooBarBaz'

''.join('_'+c.lower() if c.isupper() else c for c in "DeathToCamelCase").strip('_')
re.sub("(.)([A-Z])", r'\1_\2', 'DeathToCamelCase').lower()

왜 두 개의 .sub () 호출을 사용하는지 모르겠습니다. :) 나는 정규 표현식 전문가가 아니지만 특정 요구 사항에 적합한이 기능으로 단순화 시켰습니다 .POST 요청에서 vars_with_underscore로 camelCasedVars를 변환하는 솔루션이 필요했습니다.

def myFunc(...):
  return re.sub('(.)([A-Z]{1})', r'\1_\2', "iTriedToWriteNicely").lower()

getHTTPResponse와 같은 이름으로 작동하지 않으므로 이름 지정 규칙이 잘못되었다고 들었습니다 (getHttpResponse와 비슷해야 함).이 양식을 훨씬 쉽게 암기 할 수 있습니다.

이 솔루션이 이전 답변보다 더 간단하다고 생각합니다.

import re

def convert (camel_input):
    words = re.findall(r'[A-Z]?[a-z]+|[A-Z]{2,}(?=[A-Z][a-z]|\d|\W|$)|\d+', camel_input)
    return '_'.join(map(str.lower, words))


# Let's test it
test_strings = [
    'CamelCase',
    'camelCamelCase',
    'Camel2Camel2Case',
    'getHTTPResponseCode',
    'get200HTTPResponseCode',
    'getHTTP200ResponseCode',
    'HTTPResponseCode',
    'ResponseHTTP',
    'ResponseHTTP2',
    'Fun?!awesome',
    'Fun?!Awesome',
    '10CoolDudes',
    '20coolDudes'
]
for test_string in test_strings:
    print(convert(test_string))

어떤 출력 :

camel_case
camel_camel_case
camel_2_camel_2_case
get_http_response_code
get_200_http_response_code
get_http_200_response_code
http_response_code
response_http
response_http_2
fun_awesome
fun_awesome
10_cool_dudes
20_cool_dudes

정규식은 세 가지 패턴과 일치합니다.

[A-Z]?[a-z]+: 선택적으로 대문자로 시작하는 연속적인 소문자입니다.
[A-Z]{2,}(?=[A-Z][a-z]|\d|\W|$): 대문자 두 개 이상의 연속 된 대문자. 마지막 대문자가 소문자 뒤에 오는 경우 lookahead를 사용하여 마지막 대문자를 제외합니다.
\d+: 연속 번호.

을 사용하여 re.findall소문자로 변환하고 밑줄로 결합 할 수있는 개별 "단어"목록을 얻습니다.

내 해결책은 다음과 같습니다.

def un_camel(text):
    """ Converts a CamelCase name into an under_score name. 

        >>> un_camel('CamelCase')
        'camel_case'
        >>> un_camel('getHTTPResponseCode')
        'get_http_response_code'
    """
    result = []
    pos = 0
    while pos < len(text):
        if text[pos].isupper():
            if pos-1 > 0 and text[pos-1].islower() or pos-1 > 0 and \
            pos+1 < len(text) and text[pos+1].islower():
                result.append("_%s" % text[pos].lower())
            else:
                result.append(text[pos].lower())
        else:
            result.append(text[pos])
        pos += 1
    return "".join(result)

의견에서 논의 된 코너 사례를 지원합니다. 예를 들어, 원하는대로 변환 getHTTPResponseCode됩니다 get_http_response_code.

그것의 재미를 위해 :

>>> def un_camel(input):
...     output = [input[0].lower()]
...     for c in input[1:]:
...             if c in ('ABCDEFGHIJKLMNOPQRSTUVWXYZ'):
...                     output.append('_')
...                     output.append(c.lower())
...             else:
...                     output.append(c)
...     return str.join('', output)
...
>>> un_camel("camel_case")
'camel_case'
>>> un_camel("CamelCase")
'camel_case'

또는 재미를 위해 더 많은 것 :

>>> un_camel = lambda i: i[0].lower() + str.join('', ("_" + c.lower() if c in "ABCDEFGHIJKLMNOPQRSTUVWXYZ" else c for c in i[1:]))
>>> un_camel("camel_case")
'camel_case'
>>> un_camel("CamelCase")
'camel_case'

표준 라이브러리에는 없지만 필요한 기능이 포함 된 것으로 보이는 이 스크립트 를 찾았습니다 .

이것은 우아한 방법이 아니며 간단한 상태 머신 (비트 필드 상태 머신)의 매우 '낮은 레벨'구현입니다.이를 해결하기 위해 가장 안티 파이 토닉 모드이지만 re 모듈은이 간단한 문제를 해결하기에는 너무 복잡한 상태 머신을 구현합니다 작업이므로 이것이 좋은 해결책이라고 생각합니다.

def splitSymbol(s):
    si, ci, state = 0, 0, 0 # start_index, current_index 
    '''
        state bits:
        0: no yields
        1: lower yields
        2: lower yields - 1
        4: upper yields
        8: digit yields
        16: other yields
        32 : upper sequence mark
    '''
    for c in s:

        if c.islower():
            if state & 1:
                yield s[si:ci]
                si = ci
            elif state & 2:
                yield s[si:ci - 1]
                si = ci - 1
            state = 4 | 8 | 16
            ci += 1

        elif c.isupper():
            if state & 4:
                yield s[si:ci]
                si = ci
            if state & 32:
                state = 2 | 8 | 16 | 32
            else:
                state = 8 | 16 | 32

            ci += 1

        elif c.isdigit():
            if state & 8:
                yield s[si:ci]
                si = ci
            state = 1 | 4 | 16
            ci += 1

        else:
            if state & 16:
                yield s[si:ci]
            state = 0
            ci += 1  # eat ci
            si = ci   
        print(' : ', c, bin(state))
    if state:
        yield s[si:ci] 


def camelcaseToUnderscore(s):
    return '_'.join(splitSymbol(s))

splitsymbol은 UpperSEQUENCEInterleaved, under_score, BIG_SYMBOLS 및 cammelCasedMethods와 같은 모든 케이스 유형을 구문 분석 할 수 있습니다.

도움이 되길 바랍니다.

너무 많은 복잡한 방법 ... "Titled"그룹을 모두 찾아서 소문자 변형을 밑줄로 결합하십시오.

>>> import re
>>> def camel_to_snake(string):
...     groups = re.findall('([A-z0-9][a-z]*)', string)
...     return '_'.join([i.lower() for i in groups])
...
>>> camel_to_snake('ABCPingPongByTheWay2KWhereIsOurBorderlands3???')
'a_b_c_ping_pong_by_the_way_2_k_where_is_our_borderlands_3'

그룹의 첫 번째 문자 또는 별도의 그룹과 같은 숫자를 원하지 않으면 ([A-z][a-z0-9]*)마스크 를 사용할 수 있습니다 .

정규식을 사용하는 것이 가장 짧을 수 있지만이 솔루션은 더 읽기 쉽습니다.

def to_snake_case(s):
    snake = "".join(["_"+c.lower() if c.isupper() else c for c in s])
    return snake[1:] if snake.startswith("_") else snake

발전기를 사용 하는 https://stackoverflow.com/users/267781/matth 에서 가볍게 채택되었습니다 .

def uncamelize(s):
    buff, l = '', []
    for ltr in s:
        if ltr.isupper():
            if buff:
                l.append(buff)
                buff = ''
        buff += ltr
    l.append(buff)
    return '_'.join(l).lower()

뛰어난 Schematics 라이브러리를 살펴보십시오

https://github.com/schematics/schematics

파이썬에서 자바 스크립트 풍미로 직렬화 / 직렬화 할 수있는 형식화 된 데이터 구조를 만들 수 있습니다.

class MapPrice(Model):
    price_before_vat = DecimalType(serialized_name='priceBeforeVat')
    vat_rate = DecimalType(serialized_name='vatRate')
    vat = DecimalType()
    total_price = DecimalType(serialized_name='totalPrice')

가능한 경우 다시 피하는 것을 선호합니다.

myString="ThisStringIsCamelCase" ''.join(['_'+i.lower() if i.isupper() else i for i in myString]).lstrip('_') 'this_string_is_camel_case'

와우 방금이 코드를 장고 스 니펫에서 훔쳤습니다. 심판 http://djangosnippets.org/snippets/585/

꽤 우아한

camelcase_to_underscore = lambda str: re.sub(r'(?<=[a-z])[A-Z]|[A-Z](?=[^A-Z])', r'_\g<0>', str).lower().strip('_')

예:

camelcase_to_underscore('ThisUser')

보고:

'this_user'

정규식 데모

정규 표현식을 사용하는 끔찍한 예 ( 이를 쉽게 정리 할 수 있습니다 :)) :

def f(s):
    return s.group(1).lower() + "_" + s.group(2).lower()

p = re.compile("([A-Z]+[a-z]+)([A-Z]?)")
print p.sub(f, "CamelCase")
print p.sub(f, "getHTTPResponseCode")

getHTTPResponseCode에서 작동합니다!

또는 람다를 사용하십시오.

p = re.compile("([A-Z]+[a-z]+)([A-Z]?)")
print p.sub(lambda x: x.group(1).lower() + "_" + x.group(2).lower(), "CamelCase")
print p.sub(lambda x: x.group(1).lower() + "_" + x.group(2).lower(), "getHTTPResponseCode")

편집 : 밑줄이 무조건 삽입되기 때문에 "Test"와 같은 경우에 개선의 여지가 있음을 쉽게 알 수 있습니다.

탭으로 구분 된 파일에서 헤더를 변경하기 위해 수행 한 작업은 다음과 같습니다. 파일의 첫 번째 줄만 편집 한 부분을 생략하고 있습니다. re 라이브러리를 사용하여 파이썬에 쉽게 적응시킬 수 있습니다. 여기에는 숫자 분리도 포함되지만 숫자는 함께 유지됩니다. 줄이나 탭의 시작 부분에 밑줄을 넣지 말라고하는 것보다 쉬웠 기 때문에 두 단계로 수행했습니다.

1 단계 ... 대문자 또는 정수 앞에 소문자를 붙여 밑줄을 붙입니다.

검색:

([a-z]+)([A-Z]|[0-9]+)

바꿔 놓음:

\1_\l\2/

2 단계 ... 위를 취하고 다시 실행하여 모든 대문자를 소문자로 변환하십시오.

검색:

([A-Z])

교체 (백 슬래시, 소문자 L, 백 슬래시) :

\l\1

나는 체인이 필요하다는 것을 제외하고는 같은 문제에 대한 해결책을 찾고 있었다. 예 :

"CamelCamelCamelCase" -> "Camel-camel-camel-case"

멋진 두 단어 솔루션에서 시작하여 다음을 생각해 냈습니다.

"-".join(x.group(1).lower() if x.group(2) is None else x.group(1) \
         for x in re.finditer("((^.[^A-Z]+)|([A-Z][^A-Z]+))", "stringToSplit"))

복잡한 논리의 대부분은 첫 단어를 소문자로 표시하지 않는 것입니다. 첫 단어를 바꾸지 않아도되는 간단한 버전이 있습니다.

"-".join(x.group(1).lower() for x in re.finditer("(^[^A-Z]+|[A-Z][^A-Z]+)", "stringToSplit"))

물론 다른 솔루션에서 설명한대로 정규식을 미리 컴파일하거나 하이픈 대신 밑줄로 조인 할 수 있습니다.

정규 표현식없이 간결하지만 HTTPResponseCode => httpresponse_code :

def from_camel(name):
    """
    ThisIsCamelCase ==> this_is_camel_case
    """
    name = name.replace("_", "")
    _cas = lambda _x : [_i.isupper() for _i in _x]
    seq = zip(_cas(name[1:-1]), _cas(name[2:]))
    ss = [_x + 1 for _x, (_i, _j) in enumerate(seq) if (_i, _j) == (False, True)]
    return "".join([ch + "_" if _x in ss else ch for _x, ch in numerate(name.lower())])

라이브러리없이 :

def camelify(out):
    return (''.join(["_"+x.lower() if i<len(out)-1 and x.isupper() and out[i+1].islower()
         else x.lower()+"_" if i<len(out)-1 and x.islower() and out[i+1].isupper()
         else x.lower() for i,x in enumerate(list(out))])).lstrip('_').replace('__','_')

조금 무겁지만

CamelCamelCamelCase ->  camel_camel_camel_case
HTTPRequest         ->  http_request
GetHTTPRequest      ->  get_http_request
getHTTPRequest      ->  get_http_request

이 사이트에서 제안 된 아주 멋진 RegEx :

(?<!^)(?=[A-Z])

파이썬에 String Split 메소드가 있다면 작동해야합니다 ...

자바에서 :

String s = "loremIpsum";
words = s.split("(?&#60;!^)(?=[A-Z])");

이 간단한 방법으로 작업을 수행해야합니다.

import re

def convert(name):
    return re.sub(r'([A-Z]*)([A-Z][a-z]+)', lambda x: (x.group(1) + '_' if x.group(1) else '') + x.group(2) + '_', name).rstrip('_').lower()

우리는 대문자 (또는 0)로 시작하고 뒤에 소문자가 오는 대문자를 찾습니다.
밑줄은 그룹에서 찾은 마지막 대문자가 나타나기 직전에 배치되며 다른 대문자가 앞에 오는 경우 대문자 앞에 놓일 수 있습니다.
밑줄이 있으면 제거하십시오.
마지막으로 전체 결과 문자열이 소문자로 변경됩니다.

( 여기 에서 가져온 온라인 작업 예제 참조 )

def convert(name):
    return reduce(
        lambda x, y: x + ('_' if y.isupper() else '') + y, 
        name
    ).lower()

그리고 이미 낙타가 아닌 입력으로 사건을 처리해야하는 경우 :

def convert(name):
    return reduce(
        lambda x, y: x + ('_' if y.isupper() and not x.endswith('_') else '') + y, 
        name
    ).lower()

누군가가 완전한 소스 파일을 변환 해야하는 경우를 대비하여 스크립트를 작성하십시오.

# Copy and paste your camel case code in the string below
camelCaseCode ="""
    cv2.Matx33d ComputeZoomMatrix(const cv2.Point2d & zoomCenter, double zoomRatio)
    {
      auto mat = cv2.Matx33d::eye();
      mat(0, 0) = zoomRatio;
      mat(1, 1) = zoomRatio;
      mat(0, 2) = zoomCenter.x * (1. - zoomRatio);
      mat(1, 2) = zoomCenter.y * (1. - zoomRatio);
      return mat;
    }
"""

import re
def snake_case(name):
    s1 = re.sub('(.)([A-Z][a-z]+)', r'\1_\2', name)
    return re.sub('([a-z0-9])([A-Z])', r'\1_\2', s1).lower()

def lines(str):
    return str.split("\n")

def unlines(lst):
    return "\n".join(lst)

def words(str):
    return str.split(" ")

def unwords(lst):
    return " ".join(lst)

def map_partial(function):
    return lambda values : [  function(v) for v in values]

import functools
def compose(*functions):
    return functools.reduce(lambda f, g: lambda x: f(g(x)), functions, lambda x: x)

snake_case_code = compose(
    unlines ,
    map_partial(unwords),
    map_partial(map_partial(snake_case)),
    map_partial(words),
    lines
)
print(snake_case_code(camelCaseCode))

나는 이것으로 꽤 운이 좋았습니다.

import re
def camelcase_to_underscore(s):
    return re.sub(r'(^|[a-z])([A-Z])',
                  lambda m: '_'.join([i.lower() for i in m.groups() if i]),
                  s)

이것은 분명히 속도에 대한 최적화 할 수있는 작은 당신이 원하는 경우 약간.

import re

CC2US_RE = re.compile(r'(^|[a-z])([A-Z])')

def _replace(match):
    return '_'.join([i.lower() for i in match.groups() if i])

def camelcase_to_underscores(s):
    return CC2US_RE.sub(_replace, s)

def convert(camel_str):
    temp_list = []
    for letter in camel_str:
        if letter.islower():
            temp_list.append(letter)
        else:
            temp_list.append('_')
            temp_list.append(letter)
    result = "".join(temp_list)
    return result.lower()

사용 : str.capitalize()변수 str에 포함 된 문자열의 첫 문자를 대문자로 변환하고 전체 문자열을 반환합니다.

예 : 명령 : "hello".capitalize () 출력 : Hello

참고 URL : https://stackoverflow.com/questions/1175208/elegant-python-function-to-convert-camelcase-to-snake-case

'development' 카테고리의 다른 글

n 번째 문자마다 문자열을 분할 하시겠습니까? (0)	2020.03.04
점 (.)이있는 Spring MVC @PathVariable이 잘립니다. (0)	2020.03.04
두 날짜를 비교하는 방법? (0)	2020.03.03
Tomcat 7에서 war 파일을 배포하는 방법 (0)	2020.03.03
특정 파일 만 숨기는 방법? (0)	2020.03.03

현재글CamelCase를 snake_case로 변환하는 우아한 Python 함수?

big-blog

CamelCase를 snake_case로 변환하는 우아한 Python 함수?

CamelCase를 snake_case로 변환하는 우아한 Python 함수?

'development' 카테고리의 다른 글

'development'의 다른글

티스토리툴바

CamelCase를 snake_case로 변환하는 우아한 Python 함수?

CamelCase를 snake_case로 변환하는 우아한 Python 함수?

'development' 카테고리의 다른 글

'development'의 다른글

관련글

티스토리툴바