I have been struggling with this for a few days and I wonder whether someone can help me. What I want to accomplish is to process a text file that contains a set of questions and answers. The content of the file (.doc or .docx) looks like this:

    Document Name
    1. Question one:
    a. Answer one to question one
    b. Answer two to question one
    c. Answer three to question one
    2. Question two:
    a. Answer one to question two
    c. Answer two to question two
    e. Answer three to question two

What I have tried so far is reading the document content through Apache POI, like this:

    fis = new FileInputStream(new File(FilePath));
    XWPFDocument doc = new XWPFDocument(fis);
    XWPFWordExtractor extract = new XWPFWordExtractor(doc);
    String extractorText = extract.getText();

So at this point I have the content of the document. Next, I tried to create a regex pattern that matches the number and dot at the start of a question (1. , 12.) and continues until it reaches a colon:

    Pattern regexPattern = Pattern.compile("^(\\d|\\d\\d)+\\.[^:]+:\\s*$", Pattern.MULTILINE);
    Matcher regexMatcher = regexPattern.matcher(extractorText);

However, when I try to iterate over the result set, I cannot find any question text:

    while (regexMatcher.find()) {
        System.out.println("Found");
        for (int i = 0; i < regexMatcher.groupCount() - 2; i += 2) {
            map.put(regexMatcher.group(i + 1), regexMatcher.group(i + 2));
            System.out.println("#" + regexMatcher.group(i + 1) + " >> " + regexMatcher.group(i + 2));
        }
    }

Since I am new to Java, I am not sure where I went wrong, and I hope someone can help me. Also, if anyone has a better approach to building a Map of questions and their associated answers, I would greatly appreciate it. Thanks in advance.

Edit: I am trying to get something like a Map whose key is the question text and whose value is a list of strings representing the set of answers associated with that question, for example:

    Map<String, List<String>> desiredResult = new HashMap<>();
    desiredResult.entrySet().forEach((entry) -> {
        String questionText = entry.getKey();
        List<String> answersList = entry.getValue();
        System.out.println("Now at question: " + questionText);
        answersList.forEach((answerText) -> {
            System.out.println("Now at answer: " + answerText);
        });
    });

This would produce the following output:

    Now at question: 1. Question one:
    Now at answer: a. Answer one to question one
    Now at answer: b. Answer two to question one
    Now at answer: c. Answer three to question one
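One thing worth noting about the loop in the question: the pattern has only one capturing group, so `groupCount()` returns 1, `groupCount() - 2` is negative, and the inner `for` body never runs even when `find()` succeeds. A possible alternative is to walk the extracted text line by line instead of pulling everything out of one regex. The following is only a minimal sketch, not a definitive solution. It assumes that the text returned by `getText()` really contains the numbering ("1.", "a.") as literal characters, with one question or answer per line. The class name `QuestionParser`, the method `parse`, and both patterns are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

final class QuestionParser {
    // Assumed question shape: digits, a dot, some text, and a trailing colon.
    private static final Pattern QUESTION = Pattern.compile("^\\d+\\.\\s*.+:\\s*$");
    // Assumed answer shape: one lowercase letter, a dot, then the answer text.
    private static final Pattern ANSWER = Pattern.compile("^[a-z]\\.\\s*.+$");

    // Groups answer lines under the most recent question line.
    static Map<String, List<String>> parse(String text) {
        // LinkedHashMap keeps the questions in document order.
        Map<String, List<String>> result = new LinkedHashMap<>();
        List<String> current = null;
        for (String raw : text.split("\\R")) {   // \R matches any line break
            String line = raw.trim();
            if (QUESTION.matcher(line).matches()) {
                current = new ArrayList<>();
                result.put(line, current);
            } else if (current != null && ANSWER.matcher(line).matches()) {
                current.add(line);
            }
            // Anything else (such as the document title) is ignored.
        }
        return result;
    }

    public static void main(String[] args) {
        String sample = "Document Name\n"
                + "1. Question one:\n"
                + "a. Answer one to question one\n"
                + "b. Answer two to question one\n"
                + "2. Question two:\n"
                + "a. Answer one to question two\n";
        parse(sample).forEach((question, answers) -> {
            System.out.println("Now at question: " + question);
            answers.forEach(a -> System.out.println("Now at answer: " + a));
        });
    }
}
```

With the POI code from the question, the call would be `QuestionParser.parse(extractorText)`. Two caveats of this sketch: questions with identical text would overwrite each other in the map, and answers that span several lines would be cut off after the first line.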
Custom Java regular expression: matching the beginning and the end
拉風(fēng)的咖菲貓
2021-05-31 16:52:04