Files CCore/inc/Scanf.h CCore/src/Scanf.cpp
Subfolders CCore/inc/scanf CCore/src/scanf
Scanning is the inverse of printing: it creates an object state from a sequence of characters. There are three actors in the scanning process: the object, the input device and the scanning options. The input device provides a character sequence, and the scanning options determine how the object state is encoded. The type of the object is responsible for the scanning implementation. There are default implementations for integral and string types.
Scanning is performed with the following scanning functions:
template <class S,class ... TT>
void Scanf(S &&inp,const char *format,TT && ... tt);

template <class S,class ... TT>
void Scanobj(S &&inp,TT && ... tt);
Scanf() scans using the given format string. The format string may specify options for each object to scan.
Scanobj() scans the input into the given objects using the default scan options.
The format string contains embedded format specifiers. Each character outside a format specifier makes Scanf() read a single character from the input and match it against the format character. If there are no more input characters or the characters differ, the scan fails. The space character ' ' is treated specially: it extracts space-like characters from the input device up to the end of the input or the first non-space character. A format specifier has the form "#<options>;", where <options> is the option string. It is converted to the scan options of the corresponding object. To scan the character '#', use "##" in the format string.
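The format string rules can be illustrated with a small stand-alone sketch. It is not CCore's implementation: the function WalkFormat() and its signature are hypothetical, and instead of scanning an object at each specifier it only records the option string.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical sketch of the format-string rules; not CCore's implementation.
// Walks the format, consumes matching input and collects the option string
// of every "#<options>;" specifier. Returns false if the input does not match.
bool WalkFormat(const std::string &format,const std::string &input,std::vector<std::string> &opts)
{
  std::size_t pos=0; // current position in the input

  for(std::size_t i=0; i<format.size() ;i++)
    {
     char ch=format[i];

     if( ch=='#' )
       {
        if( i+1<format.size() && format[i+1]=='#' ) // "##" matches a single '#'
          {
           if( pos>=input.size() || input[pos]!='#' ) return false;

           pos++;
           i++;
          }
        else // "#<options>;" : record the options, a real scanner would scan an object here
          {
           std::size_t end=format.find(';',i+1);

           if( end==std::string::npos ) return false; // malformed specifier

           opts.push_back(format.substr(i+1,end-i-1));

           i=end;
          }
       }
     else if( ch==' ' ) // a space in the format skips any run of space-like input characters
       {
        while( pos<input.size() && std::isspace((unsigned char)input[pos]) ) pos++;
       }
     else // any other character must match exactly
       {
        if( pos>=input.size() || input[pos]!=ch ) return false;

        pos++;
       }
    }

  return true;
}
```

With the format "a ## b#.x;#;" and the input "a   # b" this sketch succeeds and records the option strings ".x" and the empty string.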
There are three ways to make a type scannable.
The most direct way is to define the method template scan() in the class definition:
class SomeClass
{
 public:

  template <class S>
  void scan(S &inp)
  {
    ....
  }
};
If you need scanning options, do it like this:
class SomeClass
{
 public:

  struct ScanOptType
  {
    ....

    ScanOptType(); // default options

    ScanOptType(const char *ptr,const char *lim); // [ptr,lim) -- options from the format string
                                                  // if the format string has a format specifier #XXX;,
                                                  // then [ptr,lim) points to the string XXX
  };

  template <class S>
  void scan(S &inp,const ScanOptType &opt)
  {
    ....
  }
};
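As an illustration of the [ptr,lim) constructor, here is a minimal sketch of an option type. The type LetterScanOpt and its ".u" option are hypothetical. How CCore reports a malformed option string is not stated here, so the ok flag is an assumption of this sketch.

```cpp
// Hypothetical option type for illustration: recognizes ".u" or ".U" as an upper-case flag.
struct LetterScanOpt
{
  bool upper;
  bool ok; // assumption of this sketch: false if the option string was not recognized

  LetterScanOpt() : upper(false),ok(true) {} // default options

  // [ptr,lim) is the text between '#' and ';' in the format specifier
  LetterScanOpt(const char *ptr,const char *lim) : upper(false),ok(true)
  {
    if( ptr==lim ) return; // "#;" selects the defaults

    if( lim-ptr==2 && ptr[0]=='.' && ( ptr[1]=='u' || ptr[1]=='U' ) )
      upper=true;
    else
      ok=false;
  }
};
```

For the specifier "#.u;" the constructor receives a range that points to ".u", and for "#;" it receives an empty range.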
The second way is to specify a scan proxy type:
class SomeClass
{
 public:

  using ScanProxyType = .... ;
};
In this case an object of the type SomeClass must be convertible to SomeClass::ScanProxyType &. For example:
struct S
{
  int val = 0 ;

  using ScanProxyType = int ;

  operator int & () { return val; }
};
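The following sketch shows why the conversion to a reference matters: the scanner works on the proxy object, and the reference makes the result land in the original object. The names Wrapped, ScanIntStub() and ScanViaProxy() are hypothetical. CCore's real dispatch is more general, and the stub stores a fixed value instead of reading input.

```cpp
// Hypothetical type with a scan proxy, following the pattern described above.
struct Wrapped
{
  int val = 0 ;

  using ScanProxyType = int ;

  operator int & () { return val; }
};

// Stand-in for a scanner of int: stores a fixed value instead of reading input.
inline void ScanIntStub(int &ret,int value) { ret=value; }

// Sketch of proxy dispatch: bind a proxy reference to the object, then scan the proxy.
template <class T>
void ScanViaProxy(T &obj,int value)
{
  typename T::ScanProxyType &proxy=obj; // uses the conversion operator of T

  ScanIntStub(proxy,value);
}
```

After ScanViaProxy(w,42) on a Wrapped object w, the member w.val holds 42, because the proxy reference refers to it.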
You can also specialize the template CCore::ScanProxy<T> to define the scan proxy type and optionally the scan option type.
namespace App {

struct S
{
  ....
};

} // namespace App

namespace CCore {

template <>
struct ScanProxy<App::S>
{
  using ProxyType = .... ;

  using OptType = .... ;
};

} // namespace CCore
You can determine the scan option type using the ScanOptAdapter<T>. If the type T has a scan option type (even through proxy), then ScanOptAdapter<T>::ScanOptType is that type.
Sometimes you need to specify scan options using an object rather than the format string. In such a case use the function BindScanOpt():
T obj;
Opt opt;

Scanf(inp,"#;",BindScanOpt(opt,obj));
An input device for scanning provides sequential access to a character sequence. An input device type must either be an actual input device type or redirect the input functionality to another type using a specialization of ScanInpAdapter. By default, the actual input device type is declared as the inner type ScanInpType.
/* struct ScanInpAdapter<S> */

template <class S>
struct ScanInpAdapter
{
  using ScanInpType = typename S::ScanInpType ;
};
The input device must provide the following methods:
class IDev
{
 public:

  using ScanInpType = IDev & ;

  ScanInpType scanRef() { return *this; }

  // cursor

  ulen operator + () const;

  bool operator ! () const;

  char operator * () const;

  void operator ++ ();

  // error

  bool isOk() const;

  bool isFailed() const;

  void fail();
};
It behaves like a character cursor with an internal fail flag, which signals scanning errors or input errors.
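A minimal stand-alone cursor with this interface might look as follows. This is a sketch under stated assumptions, not a CCore class: std::size_t stands in for ulen, operator + is assumed to return the number of available characters, and a failed cursor is assumed to report no characters. The class StrCursor and the helper CountChars() are hypothetical names.

```cpp
#include <cstddef>
#include <string>

// Hypothetical cursor over a std::string that follows the interface listed above.
// The string must outlive the cursor.
class StrCursor
{
  const char *ptr;
  const char *lim;
  bool failed = false ;

 public:

  using ScanInpType = StrCursor & ;

  ScanInpType scanRef() { return *this; }

  explicit StrCursor(const std::string &str) : ptr(str.data()),lim(str.data()+str.size()) {}

  // cursor

  std::size_t operator + () const { return failed? 0 : std::size_t(lim-ptr) ; } // characters left (assumed meaning)

  bool operator ! () const { return failed || ptr==lim ; } // true if nothing can be read

  char operator * () const { return *ptr; } // valid only while characters are left

  void operator ++ () { ++ptr; }

  // error

  bool isOk() const { return !failed; }

  bool isFailed() const { return failed; }

  void fail() { failed=true; }
};

// Counts the remaining characters by walking the cursor to its end.
inline std::size_t CountChars(StrCursor &cur)
{
  std::size_t count=0;

  for(; +cur ;++cur) count++;

  return count;
}
```

The loop form for(; +cur ;++cur) is the typical way to consume such a cursor: test, read with *cur, advance.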
The simplest way to build an input device class is to derive it from the ScanBase class.
class ScanBase : NoCopy
{
  ....

 private:

  virtual PtrLen<const char> underflow()=0;

 protected:

  void pump(); // must be called in constructor of derived class

  void reset();

 public:

  // constructors

  ScanBase();

  ~ScanBase();

  // cursor

  ulen operator + () const;

  bool operator ! () const;

  char operator * () const;

  void operator ++ ();

  // position

  TextPos getTextPos() const;

  // error

  bool isOk() const;

  bool isFailed() const;

  void fail();
};
This class tracks the current text position. You can retrieve it with the method getTextPos().
A derived class must implement the virtual method underflow(). This method is called when the previous character frame has been consumed. It must return a character range, which is empty if the end of input is reached. The range must remain valid until the next call of underflow(). In case of error the method fail() must be called, and an exception may be thrown.
There are two protected methods for use in a derived class. pump() performs the character pumping; it must be called in the derived class constructor as the last action. reset() resets the base to the initial state. It can be used in derived class methods like open() or close(), when the input is changed.
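The underflow()/pump() protocol can be sketched in miniature. The classes below are hypothetical and are not CCore's ScanBase: a pointer pair replaces PtrLen<const char>, and text position and error handling are left out. The sketch also shows why pump() belongs in the derived constructor: it calls the virtual underflow(), which is only dispatched to the derived class once the derived object is under construction.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature of the underflow()/pump() protocol; not CCore's ScanBase.
class MiniScanBase
{
  const char *ptr = nullptr ;
  const char *lim = nullptr ;

  // Returns the next character frame as [first,second); an empty frame means end of input.
  virtual void underflow(const char *&first,const char *&second)=0;

 protected:

  // Fetches the next frame; must be called last in the derived class constructor.
  void pump() { underflow(ptr,lim); }

 public:

  virtual ~MiniScanBase() {}

  bool operator ! () const { return ptr==lim; }

  char operator * () const { return *ptr; }

  void operator ++ () { if( ++ptr==lim ) pump(); } // frame consumed, ask for the next one
};

// Feeds the base from a list of chunks, one chunk per underflow() call.
class ChunkScan : public MiniScanBase
{
  std::vector<std::string> chunks; // never modified, so returned frames stay valid
  std::size_t next = 0 ;

  void underflow(const char *&first,const char *&second) override
  {
    // skip empty chunks, so that an empty frame always means end of input
    while( next<chunks.size() && chunks[next].empty() ) next++;

    if( next<chunks.size() )
      {
       const std::string &s=chunks[next++];

       first=s.data();
       second=s.data()+s.size();
      }
    else
      {
       first=second=nullptr;
      }
  }

 public:

  explicit ChunkScan(std::vector<std::string> chunks_) : chunks(std::move(chunks_)) { pump(); }
};

// Drains any cursor of this kind into a string.
template <class S>
std::string ReadAll(S &inp)
{
  std::string ret;

  for(; !!inp ;++inp) ret+=*inp;

  return ret;
}
```

Reading a ChunkScan built from the chunks "ab", "" and "cd" yields "abcd": the reader never sees the frame boundaries.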
The class ScanString scans a given string:
class ScanString : public ScanBase
{
  PtrLen<const char> str;

 private:

  virtual PtrLen<const char> underflow() { return Replace_null(str); }

 public:

  explicit ScanString(StrLen str_) : str(str_) { pump(); }

  ~ScanString() {}
};
There is a default implementation of the string scanning. It is applied to objects of the type String. The following scan options are supported:
/* enum StringScanType */

enum StringScanType
{
  StringScanQuote,
  StringScanToSpace,
  StringScanToPunct,

  StringScanDefault = StringScanToPunct
};

template <class Dev>
bool Parse_try(Dev &dev,StringScanType &ret);

/* struct StringScanOpt */

struct StringScanOpt
{
  StringScanType type;

  void setDefault() { type=StringScanDefault; }

  StringScanOpt() { setDefault(); }

  StringScanOpt(const char *ptr,const char *lim);

  //
  // [.q|.Q|.s|.S|.p|.P]
  //
};
There are three variants of string scanning, selected by options. The options ".q" and ".Q" mean scan a quoted string (StringScanQuote), ".s" and ".S" mean scan up to a space character (StringScanToSpace), and ".p" and ".P" mean scan up to a space or punctuation character (StringScanToPunct). The last mode is the default. The punctuation characters are: ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ ` { | } ~.
When scanning a quoted string, the input is scanned for a string of characters enclosed in double quotes. The quotes are not included in the result. If the input does not begin with a quoted string, the scan fails.
When scanning up to a character of some kind, the input is scanned for a string of characters up to the first character of that kind.
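The three modes can be sketched on a plain std::string. The names ScanModeSketch and ScanStringSketch() are hypothetical, and the sketch returns false where the real scanner would fail the input. Escape sequences inside quotes are not handled, since nothing is stated about them here. Note that the punctuation list above does not contain the underscore, so the sketch excludes it explicitly from std::ispunct().

```cpp
#include <cctype>
#include <string>

enum ScanModeSketch { ModeQuote, ModeToSpace, ModeToPunct };

// Hypothetical sketch of the three string scan modes; not CCore's implementation.
// Scans from input[pos], advances pos past the consumed characters and stores the text in ret.
bool ScanStringSketch(const std::string &input,std::size_t &pos,ScanModeSketch mode,std::string &ret)
{
  ret.clear();

  if( mode==ModeQuote )
    {
     if( pos>=input.size() || input[pos]!='"' ) return false; // must begin with a quote

     std::size_t end=input.find('"',pos+1);

     if( end==std::string::npos ) return false; // no closing quote

     ret=input.substr(pos+1,end-pos-1); // the quotes are not included

     pos=end+1;

     return true;
    }

  for(; pos<input.size() ;pos++)
    {
     unsigned char ch=(unsigned char)input[pos];

     if( std::isspace(ch) ) break;

     // '_' is not in the documented punctuation list, although std::ispunct() accepts it
     if( mode==ModeToPunct && ch!='_' && std::ispunct(ch) ) break;

     ret+=input[pos];
    }

  return true;
}
```

For the input key=value next the to-punctuation mode yields key, while the to-space mode yields key=value.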
The class StringSetScan scans the input for one of the given strings and returns the string index. The index is 1-based. If the input does not begin with one of the expected strings, the input is failed and the index is set to zero.
class StringSetScan : NoCopy
{
  ....

 public:

  StringSetScan(std::initializer_list<const char *> zstr_list);

  ~StringSetScan();

  using PrintProxyType = ulen ;

  operator ulen() const { return index; }

  template <class S>
  void scan(S &inp);
};
The constructor takes a braced list of zero-terminated strings. These strings must persist for the lifetime of the object. Usually they are string literals.
The cast operator returns the 1-based index of the found string, or zero if the scan has failed.
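The index convention can be shown with a small stand-alone function. MatchStringSet() is a hypothetical name, and it works on a zero-terminated string rather than an input device. It picks the first listed string that matches; how StringSetScan resolves the case where one listed string is a prefix of another is not stated here, so that choice is an assumption.

```cpp
#include <cstddef>
#include <cstring>
#include <initializer_list>

// Hypothetical sketch: returns the 1-based index of the first listed string
// that the input begins with, or 0 if none matches.
inline std::size_t MatchStringSet(const char *input,std::initializer_list<const char *> zstr_list)
{
  std::size_t index=0;

  for(const char *zstr : zstr_list )
    {
     index++;

     if( std::strncmp(input,zstr,std::strlen(zstr))==0 ) return index;
    }

  return 0;
}
```

Because zero is never a valid index, the result can be tested directly in a condition or used in a switch with a default branch for the failed scan.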
There is a default implementation of integral type scanning. It is applied to objects of integral types. The following scan options are supported:
/* enum IntScanBase */

enum IntScanBase
{
  IntScanNone,

  IntScanHex,
  IntScanBin,
  IntScan0X,

  IntScanDefault = IntScanNone
};

template <class Dev>
bool Parse_try(Dev &dev,IntScanBase &ret);

/* struct IntScanOpt */

struct IntScanOpt
{
  unsigned base; // 2..16, 0 for any
  IntScanBase scan_base;

  void setDefault() { base=0; scan_base=IntScanDefault; }

  IntScanOpt() { setDefault(); }

  IntScanOpt(const char *ptr,const char *lim);

  //
  // [.base|.x|.X|.b|.B|.h|.H]
  //
};
There are four forms of the integer representation. In the patterns below, D1.. stands for one or more digits D.
Binary form:
[+|-]D1..(b|B) , where D is a binary digit
Hex form:
[+|-]D1..(h|H) , where D is a hex digit
0x form:
[+|-]0xD1.. , where D is a hex digit
And base form, where base is the representation base from 2 to 16:
[+|-]D1.. , where D is a base digit
By default the integer scanner accepts all forms of the integer representation, and the base is assumed to be 10. Using options you can select a particular form: .b or .B for the binary form, .h or .H for the hex form, .x or .X for the 0x form, and .base for the base form.
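The four forms can be sketched as a stand-alone parser that is told which form to expect. All names here are hypothetical, and the sketch is not CCore's scanner: it parses a whole string, does no overflow checking, accepts only a lower-case x in the 0x prefix as written in the pattern above, and does not attempt the default mode that accepts every form. Accepting hex digits in both letter cases is also an assumption.

```cpp
#include <cctype>
#include <cstring>
#include <string>

enum IntFormSketch { FormBase, FormBin, FormHex, Form0X };

// Digit value of ch in the given base, or -1 (both letter cases accepted, an assumption).
inline int DigitSketch(char ch,unsigned base)
{
  static const char Digits[]="0123456789abcdef";

  const char *p=std::strchr(Digits,std::tolower((unsigned char)ch));

  if( !p || !*p ) return -1; // strchr() finds the terminator for ch==0

  unsigned val=(unsigned)(p-Digits);

  return ( val<base )? (int)val : -1 ;
}

// Hypothetical sketch: parses the whole text in one explicitly selected form.
// base is used only for FormBase. Returns false if the text does not have that form.
bool ParseIntSketch(std::string text,IntFormSketch form,unsigned base,long &ret)
{
  bool neg=false;
  std::size_t i=0;

  if( !text.empty() && ( text[0]=='+' || text[0]=='-' ) ) { neg=( text[0]=='-' ); i=1; }

  switch( form )
    {
     case FormBin : // digits followed by b or B
       if( text.empty() || ( text.back()!='b' && text.back()!='B' ) ) return false;
       text.pop_back();
       base=2;
       break;

     case FormHex : // digits followed by h or H
       if( text.empty() || ( text.back()!='h' && text.back()!='H' ) ) return false;
       text.pop_back();
       base=16;
       break;

     case Form0X : // 0x followed by digits
       if( text.compare(i,2,"0x")!=0 ) return false;
       i+=2;
       base=16;
       break;

     case FormBase : // plain digits in the given base
       break;
    }

  if( i>=text.size() ) return false; // at least one digit is required

  long val=0;

  for(; i<text.size() ;i++)
    {
     int d=DigitSketch(text[i],base);

     if( d<0 ) return false;

     val=val*(long)base+d;
    }

  ret=neg? -val : val ;

  return true;
}
```

In this sketch "-1010b" in the binary form gives -10, "FFh" in the hex form gives 255, "0x1F" in the 0x form gives 31, and "+777" in the base form with base 8 gives 511.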
There is a family of scan utility functions.
int CharBaseValue(char ch,unsigned base);
This function returns the digit value of the given character in the representation base base, or -1 if the character is not a digit in that base. The base must be in the range [2,16].
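A possible stand-alone equivalent is sketched below under the name CharBaseValueSketch(), which is hypothetical. It follows the description above; accepting hex letters in both cases is an assumption about the real function.

```cpp
#include <cassert>

// Hypothetical re-implementation for illustration; not CCore's source.
// Returns the digit value of ch in the given base, or -1 if ch is not a digit in that base.
int CharBaseValueSketch(char ch,unsigned base)
{
  assert( base>=2 && base<=16 ); // documented precondition

  int val;

  if( ch>='0' && ch<='9' )
    val=ch-'0';
  else if( ch>='a' && ch<='f' )
    val=ch-'a'+10;
  else if( ch>='A' && ch<='F' ) // upper-case letters accepted by assumption
    val=ch-'A'+10;
  else
    return -1;

  return ( (unsigned)val<base )? val : -1 ;
}
```

The same character can be a digit in one base and not in another: '7' has the value 7 in base 10 and gives -1 in base 2.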
template <class S>
void SkipSpace(S &inp);
This function extracts space-like characters from the input up to the end of the input or the first non-space character. It can be applied to any char cursor.
template <class S>
bool ProbeChar(S &inp,char ch);
This function tries to move over the specified char. On success it returns true; otherwise it returns false and the cursor remains at the current position. It can be applied to any char cursor.
template <class S,class ... CC>
void PassChars(S &inp,CC ... cc);
This function extracts the given sequence of characters or fails the input.
template <class S,class Func>
void PassOneOfChar(S &inp,Func func);
This function extracts one character that satisfies the given condition, or fails the input. func is a functor with the signature boolable (char ch); it specifies the condition.
template <class S,class Func>
void SkipAllOfChar(S &inp,Func func);
This function extracts characters that satisfy the given condition, up to the end of the input or the first character that does not satisfy the condition. It can be applied to any char cursor. func is a functor with the signature boolable (char ch); it specifies the condition.
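The cursor utilities above are short enough to sketch in full. The versions below carry a Sketch suffix to mark them as hypothetical re-implementations, and MiniCursor is a hypothetical minimal cursor. They rely only on the cursor operators and, where the input can be failed, on fail().

```cpp
#include <cctype>

// Hypothetical minimal cursor for the sketch; any char cursor with these operators would do.
struct MiniCursor
{
  const char *ptr;
  const char *lim;
  bool failed;

  MiniCursor(const char *ptr_,const char *lim_) : ptr(ptr_),lim(lim_),failed(false) {}

  bool operator ! () const { return ptr==lim; }

  char operator * () const { return *ptr; }

  void operator ++ () { ++ptr; }

  void fail() { failed=true; ptr=lim; } // a failed cursor yields no more characters
};

// Skips characters while the condition holds.
template <class S,class Func>
void SkipAllOfCharSketch(S &inp,Func func)
{
  for(; !!inp && func(*inp) ;++inp) {}
}

// Skips space-like characters.
template <class S>
void SkipSpaceSketch(S &inp)
{
  SkipAllOfCharSketch(inp, [] (char ch) { return std::isspace((unsigned char)ch)!=0; } );
}

// Moves over ch if it is the next character; otherwise leaves the cursor in place.
template <class S>
bool ProbeCharSketch(S &inp,char ch)
{
  if( !!inp && *inp==ch ) { ++inp; return true; }

  return false;
}

// Extracts one character that satisfies the condition, or fails the input.
template <class S,class Func>
void PassOneOfCharSketch(S &inp,Func func)
{
  if( !!inp && func(*inp) ) ++inp; else inp.fail();
}
```

The difference between probing and passing is visible here: ProbeCharSketch() reports a mismatch through its return value and leaves the input usable, while PassOneOfCharSketch() turns a mismatch into a failed input.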
enum EndOfScanType
{
EndOfScan
};
This word can be scanned. It fails the input if the input is not empty, i.e. if unread characters remain.
template <class T,class ProxySet>
struct ScanProxySet : ScanOptAdapter<ProxySet>
{
  struct ProxyType
  {
    T &ret;

    explicit ProxyType(T &ret_) : ret(ret_) {}

    template <class S>
    void scan(S &inp)
    {
      ProxySet set;

      Scanobj(inp,set);

      set.map(ret);
    }

    template <class S,class OptType>
    void scan(S &inp,const OptType &opt)
    {
      ProxySet set;

      Scanobj(inp,BindScanOpt(opt,set));

      set.map(ret);
    }
  };
};
This class can be used to specialize the template ScanProxy<T>:
template <> struct ScanProxy<SomeType> : ScanProxySet<SomeType,SetType> {};
The class SetType is used for scanning instead of the type SomeType. The result is then transferred to the target object using the method map().
class SetType
{
 public:

  SetType();

  void map(SomeType &ret) const;

  template <class S>
  void scan(S &inp);
};

or

class SetType
{
 public:

  SetType();

  void map(SomeType &ret) const;

  using ScanOptType = .... ;

  template <class S>
  void scan(S &inp,ScanOptType opt);
};
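The scan-then-map sequence can be sketched without CCore. Everything below is hypothetical: ColorSketch plays the role of SomeType, ColorSetSketch plays the role of SetType, and scanWord() takes a ready word in place of scan(S &inp), so no input device is needed. Leaving the target unchanged after a failed scan is a choice made in this sketch.

```cpp
#include <string>

enum ColorSketch { ColorRed, ColorGreen };

// Hypothetical set type: records which name was seen, then map() converts it to the target type.
class ColorSetSketch
{
  int index = 0 ; // 1-based index of the matched name, 0 if none

 public:

  ColorSetSketch() {}

  // Stand-in for scan(S &inp): takes the word directly instead of reading an input device.
  void scanWord(const std::string &word)
  {
    if( word=="red" ) index=1;
    else if( word=="green" ) index=2;
    else index=0;
  }

  void map(ColorSketch &ret) const
  {
    switch( index )
      {
       case 1 : ret=ColorRed; break;
       case 2 : ret=ColorGreen; break;
       default: break; // leave ret unchanged after a failed scan
      }
  }
};

// Mirrors the sequence in ScanProxySet<T,ProxySet>::ProxyType::scan():
// create the set, scan into it, map the result to the target.
template <class T,class ProxySet>
void ScanThroughSet(const std::string &word,T &ret)
{
  ProxySet set;

  set.scanWord(word);

  set.map(ret);
}
```

The target type stays free of scanning code: all knowledge of the textual names lives in the set type.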