3 Answers

Contributor with 1868 experience points · 4+ likes
You could try this:
^([a-z, \(\)-]*?)?\(?([\d,]+)?\)?\s*?\(?([\d,-]+)?\)?$
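Explanation:
^ - anchors the match to the start of the string.
([a-z, \(\)-]*?)? - matches any of the letters a to z, a comma, a space, (, ) or -, zero or more times (lazy).
\(? - matches a literal (; the ? makes it optional.
([\d,]+)? - matches any digit or comma, one or more times; the trailing ? makes the group optional.
\)? - matches a literal ); the ? makes it optional.
\s*? - matches whitespace zero or more times (lazy).
\(?([\d,-]+)?\)? - matches any digit, comma or -, one or more times, optionally wrapped in parentheses.
$ - anchors the match to the end of the string.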
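
To see how the pieces described above fit together, here is a minimal Python sketch that applies the pattern to single lines. The re.IGNORECASE flag is an assumption added here, because [a-z] on its own would not match upper-case descriptions; the helper name split_line is hypothetical. Note that this pattern captures the digits without the surrounding parentheses.

```python
import re

# Pattern copied from the answer above. re.IGNORECASE is an assumption added
# here because [a-z] on its own would not match upper-case descriptions.
PATTERN = re.compile(
    r"^([a-z, \(\)-]*?)?\(?([\d,]+)?\)?\s*?\(?([\d,-]+)?\)?$",
    re.IGNORECASE,
)


def split_line(line):
    """Hypothetical helper: return (description, first, second) or None."""
    match = PATTERN.match(line)
    if match is None:
        return None
    description, first, second = match.groups()
    # The lazy first group keeps the separating space, so trim it.
    return (description or "").strip(), first, second


print(split_line("LOSS BEFORE INCOME TAXES (900,000) (900,000)"))
print(split_line("Business taxes 116 -"))
print(split_line("EXPENSES"))
```

Lines containing characters outside the first character class, such as % or ;, will not match this pattern at all, so split_line returns None for them.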

Contributor with 1783 experience points · 4+ likes
I think this regular expression will do what you want:
^([A-Z][A-Za-z0-9 (),%;-]+?[^(\d\s])? ?(?:(\(?[\d,]+\)?|-)\s+(\(?[\d,]+\)?|-))?$
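It looks for a description that starts with an upper-case letter, may include letters, digits, spaces and any of the characters (),%;- but does not end with a (, a digit or whitespace. The description is followed by two values, each of which is either a run of digits and commas, optionally enclosed in (), or a single -. All groups are optional, so lines with no description or no numbers still match.
In Python: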
import re
data = """LOSS BEFORE INCOME TAXES (900,000) (900,000)
INCOME TAXES (RECOVERED) (90,000) (90,000)
RETAINED EARNINGS - BEGINNING OF YEAR 9,999,999 9,999,999
EXPENSES
Subcontracts 8,058 2,655
Business taxes 116 -
600,000 600,000
GROSS PROFIT (50%; 2016 - 50%) 500,000 500,000
Bad debts - 50
Salaries, wages and benefits 400,000 400,000"""
# re.MULTILINE lets ^ and $ match at the start and end of every line in data.
regex = re.compile(r'^([A-Z][A-Za-z0-9 (),%;-]+?[^(\d\s])? ?(?:(\(?[\d,]+\)?|-)\s+(\(?[\d,]+\)?|-))?$', re.MULTILINE)
print(regex.findall(data))
Output:
[('LOSS BEFORE INCOME TAXES', '(900,000)', '(900,000)'),
('INCOME TAXES (RECOVERED)', '(90,000)', '(90,000)'),
('RETAINED EARNINGS - BEGINNING OF YEAR', '9,999,999', '9,999,999'),
('EXPENSES', '', ''),
('Subcontracts', '8,058', '2,655'),
('Business taxes', '116', '-'),
('', '600,000', '600,000'),
('GROSS PROFIT (50%; 2016 - 50%)', '500,000', '500,000'),
('Bad debts', '-', '50'),
('Salaries, wages and benefits', '400,000', '400,000')
]
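
The captured values are still strings, so a follow-up step is usually needed to turn them into numbers. The sketch below is one possible way to do that. It assumes, and this is not stated in the answer, that a value in parentheses is a negative amount (common accounting notation) and that a lone - or an empty capture means no value was reported. The helper name to_number is hypothetical.

```python
def to_number(text):
    """Hypothetical helper: convert a captured amount string to an int or None.

    Assumptions (not from the answer): "(900,000)" denotes a negative amount,
    and "-" or an empty capture means that no value was reported.
    """
    if text in ("", "-"):
        return None
    negative = text.startswith("(") and text.endswith(")")
    # Drop the parentheses and the thousands separators before converting.
    digits = text.strip("()").replace(",", "")
    value = int(digits)
    return -value if negative else value


# Rows in the shape returned by regex.findall(data) above.
rows = [
    ("LOSS BEFORE INCOME TAXES", "(900,000)", "(900,000)"),
    ("Business taxes", "116", "-"),
    ("EXPENSES", "", ""),
]
parsed = [(name, to_number(a), to_number(b)) for name, a, b in rows]
print(parsed)
```

Returning None rather than 0 for a missing value keeps "not reported" distinct from a genuine zero; whether that distinction matters depends on the data.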

Contributor with 1851 experience points · 5+ likes
Try the following regular expression:
r"([\w ,()-]*)[\(?[\d, -]*\)?]*[\(?[\d, -]*\)?]*"