std::regex_token_iterator

Defined in header `<regex>`
template< class BidirIt, class CharT = typename std::iterator_traits<BidirIt>::value_type, class Traits = std::regex_traits<CharT> > class regex_token_iterator		(since C++11)

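std::regex_token_iterator is a read-only LegacyForwardIterator that accesses the individual sub-matches of every match of a regular expression within the underlying character sequence. It can also be used to access the parts of the sequence that were not matched by the given regular expression (e.g. as a tokenizer).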

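On construction, it constructs a std::regex_iterator. On every increment it steps through the requested sub-matches of the current match_results, and when it is incremented past the last requested sub-match, it increments the underlying std::regex_iterator.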
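The stepping order described above can be observed by requesting more than one sub-match index. The following is a minimal sketch, assuming a hypothetical helper `key_value_tokens` and a `key=value` pattern chosen only for illustration:

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects every token produced for the requested
// sub-match indices {1, 2}. For each match the iterator yields group 1, then
// group 2, and only then advances the underlying std::regex_iterator.
std::vector<std::string> key_value_tokens(const std::string& text)
{
    static const std::regex pair_re(R"((\w+)=(\w+))"); // illustrative pattern
    std::vector<std::string> out;
    for (std::sregex_token_iterator it(text.begin(), text.end(), pair_re, {1, 2}), end;
         it != end; ++it)
        out.push_back(it->str());
    return out;
}
```

Called with "a=1 b=2", the helper is expected to return the tokens a, 1, b, 2 in that order: both groups of the first match come before any group of the second.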

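The default-constructed std::regex_token_iterator is the end-of-sequence iterator. When a valid std::regex_token_iterator is incremented after reaching the last sub-match of the last match, it becomes equal to the end-of-sequence iterator. Dereferencing or incrementing it further invokes undefined behavior.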
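A loop over tokens therefore usually compares against a default-constructed iterator and stops there. The sketch below assumes a hypothetical helper `count_matches`; it never dereferences or increments the iterator once it equals the end-of-sequence iterator.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: counts the matches of `re` in `text` by walking from a
// valid iterator to the default-constructed end-of-sequence iterator.
std::size_t count_matches(const std::string& text, const std::regex& re)
{
    std::sregex_token_iterator first(text.begin(), text.end(), re); // sub-match 0
    std::sregex_token_iterator last;                                // end-of-sequence
    std::size_t n = 0;
    for (; first != last; ++first)
        ++n;
    return n;
}
```

If the regular expression does not match at all (and index -1 is not requested), the freshly constructed iterator already compares equal to the end-of-sequence iterator and the loop body never runs.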

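Just before becoming the end-of-sequence iterator, a std::regex_token_iterator may become a suffix iterator, if the index -1 (non-matched fragment) appears in the list of requested sub-match indices. Such an iterator, if dereferenced, returns a std::sub_match corresponding to the sequence of characters between the last match and the end of the sequence.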
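The suffix state is what makes the trailing piece of a split visible. A minimal sketch, assuming a hypothetical helper named `split`:

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: splits `text` on `delim` by requesting index -1 (the
// non-matched fragments). If any text follows the final delimiter, the last
// element is produced while the iterator is in its suffix state.
std::vector<std::string> split(const std::string& text, const std::regex& delim)
{
    std::sregex_token_iterator first(text.begin(), text.end(), delim, -1), last;
    return std::vector<std::string>(first, last);
}
```

With the delimiter `,\s*`, the input "a, b, c" should yield a, b, c, where c comes from the suffix iterator. For "a,b," the suffix after the last delimiter is empty, so no extra empty token is expected.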

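A typical implementation of std::regex_token_iterator holds the underlying std::regex_iterator, a container (e.g. std::vector<int>) of the requested sub-match indices, an internal counter equal to the index of the current sub-match, a pointer to std::sub_match pointing at the current sub-match of the current match, and a std::match_results object containing the last non-matched character sequence (used in tokenizer mode).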
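To make the role of that state more concrete, the sketch below reproduces tokenizer mode (index -1) by hand on top of std::regex_iterator. It is only an illustration under the assumption that remembering the latest suffix is enough; it is not how any particular standard library implements the class, and the name `split_by_hand` is hypothetical.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: emulates regex_token_iterator with index -1 using a
// plain std::regex_iterator plus a remembered trailing fragment.
std::vector<std::string> split_by_hand(const std::string& text, const std::regex& delim)
{
    std::vector<std::string> out;
    std::sregex_iterator it(text.begin(), text.end(), delim), end;
    if (it == end) {            // no match: the whole range is the only token
        out.push_back(text);
        return out;
    }
    std::ssub_match tail;       // last non-matched fragment seen so far
    for (; it != end; ++it) {
        out.push_back(it->prefix().str()); // text between previous and current match
        tail = it->suffix();               // candidate for the final token
    }
    if (tail.length() != 0)     // corresponds to the suffix-iterator state
        out.push_back(tail.str());
    return out;
}
```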

Defined in header `<regex>`
Type	Definition
`std::cregex_token_iterator`	std::regex_token_iterator<const char*>
`std::wcregex_token_iterator`	std::regex_token_iterator<const wchar_t*>
`std::sregex_token_iterator`	std::regex_token_iterator<std::string::const_iterator>
`std::wsregex_token_iterator`	std::regex_token_iterator<std::wstring::const_iterator>
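The aliases differ only in the iterator type they walk over. As a small sketch, `std::cregex_token_iterator` can scan a null-terminated string without first copying it into a std::string; the helper name `words` and the pattern are assumptions made for this example.

```cpp
#include <cstring>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: extracts alphabetic words from a C string using the
// const char* specialization.
std::vector<std::string> words(const char* s)
{
    static const std::regex word_re("[A-Za-z]+"); // illustrative pattern
    std::cregex_token_iterator first(s, s + std::strlen(s), word_re), last;
    return std::vector<std::string>(first, last);
}
```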

Member types

Member type	Definition
`value_type`	std::sub_match<BidirIt>
`difference_type`	std::ptrdiff_t
`pointer`	const value_type*
`reference`	const value_type&
`iterator_category`	std::forward_iterator_tag
`iterator_concept` (since C++20)	std::input_iterator_tag
`regex_type`	std::basic_regex<CharT, Traits>
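The table can be checked at compile time. The following sketch spells out the member types for the `std::sregex_token_iterator` specialization; the alias name `TokenIt` is introduced here only for brevity.

```cpp
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>
#include <type_traits>

using TokenIt = std::sregex_token_iterator; // alias used only in this sketch

static_assert(std::is_same<TokenIt::value_type, std::ssub_match>::value,
              "value_type is sub_match over std::string::const_iterator");
static_assert(std::is_same<TokenIt::difference_type, std::ptrdiff_t>::value,
              "difference_type is std::ptrdiff_t");
static_assert(std::is_same<TokenIt::reference, const std::ssub_match&>::value,
              "dereferencing yields a const reference");
static_assert(std::is_same<TokenIt::iterator_category, std::forward_iterator_tag>::value,
              "iterator_category stays forward_iterator_tag");
static_assert(std::is_same<TokenIt::regex_type, std::regex>::value,
              "regex_type is basic_regex<char>");
```

Because `reference` is a reference to const, tokens can be read through the iterator but not modified.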

Member functions

(constructor)	constructs a new `regex_token_iterator` (public member function)
(destructor) (implicitly declared)	destructs a `regex_token_iterator`, including the cached value (public member function)
operator=	assigns contents (public member function)
operator==, operator!= (operator!= removed in C++20)	compares two `regex_token_iterator`s (public member function)
operator*, operator->	accesses the current sub-match (public member function)
operator++, operator++(int)	advances the iterator to the next sub-match (public member function)
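A short sketch exercising several of these members together: comparison, post-increment, `operator->` and `operator*`. The helper `first_two_numbers` and its output format are hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: joins the first two runs of digits in `text` with '+'.
std::string first_two_numbers(const std::string& text)
{
    static const std::regex num_re("[0-9]+"); // illustrative pattern
    std::sregex_token_iterator it(text.begin(), text.end(), num_re), end;
    std::string out;
    if (it == end)                            // operator==
        return out;
    std::sregex_token_iterator saved = it++;  // operator++(int) returns the old position
    out += saved->str();                      // operator->
    if (it != end) {                          // operator!=
        out += '+';
        out += (*it).str();                   // operator*
    }
    return out;
}
```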

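Notes

It is the programmer's responsibility to ensure that the std::basic_regex object passed to the iterator's constructor outlives the iterator. Because the iterator stores a std::regex_iterator, which stores a pointer to the regex, incrementing the iterator after the regex has been destroyed results in undefined behavior.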
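One simple way to satisfy this requirement is to give the regex a name and declare it before the iterators, as in the sketch below. The helper name `digit_runs` is hypothetical, and the remark about temporaries assumes a standard library that declares the rvalue-regex constructor overloads as deleted.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns every run of digits found in `text`.
std::vector<std::string> digit_runs(const std::string& text)
{
    // The named regex is declared before the iterators and destroyed after
    // them, so it outlives every increment performed below.
    const std::regex digits("[0-9]+");
    std::sregex_token_iterator first(text.begin(), text.end(), digits), last;

    // By contrast, a temporary regex would be destroyed at the end of the
    // declaration, leaving the iterator with a dangling pointer:
    //   std::sregex_token_iterator bad(text.begin(), text.end(), std::regex("[0-9]+"));
    // Assumption: implementations with the deleted rvalue overloads reject
    // that line at compile time.
    return std::vector<std::string>(first, last);
}
```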

Example

#include <algorithm>
#include <iostream>
#include <iterator>
#include <regex>
#include <string>
 
int main()
{
    // Tokenization (non-matched fragments)
    // Note that regex is matched only two times; when the third value is obtained
    // the iterator is a suffix iterator.
    const std::string text = "Quick brown fox.";
    const std::regex ws_re("\\s+"); // whitespace
    std::copy(std::sregex_token_iterator(text.begin(), text.end(), ws_re, -1),
              std::sregex_token_iterator(),
              std::ostream_iterator<std::string>(std::cout, "\n"));
 
    std::cout << '\n';
 
    // Iterating the first submatches
    const std::string html = R"(<p><a href="http://google.com">google</a> )"
                             R"(< a HREF ="http://cppreference.tw">cppreference</a>\n</p>)";
    const std::regex url_re(R"!!(<\s*A\s+[^>]*href\s*=\s*"([^"]*)")!!", std::regex::icase);
    std::copy(std::sregex_token_iterator(html.begin(), html.end(), url_re, 1),
              std::sregex_token_iterator(),
              std::ostream_iterator<std::string>(std::cout, "\n"));
}

Output:

Quick
brown
fox.
 
http://google.com
http://cppreference.tw

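Defect reports

The following behavior-changing defect reports were applied retroactively to previously published C++ standards.

DR	Applied to	Behavior as published	Correct behavior
LWG 3698 (P2770R0)	C++20	`regex_token_iterator` was a `forward_iterator` while being a stashing iterator	made `input_iterator`^[1]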

[1] iterator_category was left unchanged by the resolution, because changing it to std::input_iterator_tag might break too much existing code.
