Files CCore/inc/gadget/PtrLen.h CCore/src/gadget/PtrLen.cpp
This file contains main tools to work with ranges of objects. Range is a continuous part of an array. There is an object in the range with the lower address, its address is a base address of the range. "Address" means a pointer of the correspondent type. If this address is p, then the range consists of objects with addresses { p, p+1, ..., p+n-1 }, where n is the number of elements of the range. A range can be empty. In this case it has the base address, which may not refer to an existing object, but the number of elements is 0. A range of length 1 can also be formed from a single object. Finally, empty range may have the base address equals null.
Sometimes Range is considered more generally using idiom of iterator, but we focus on the basic type of ranges, because of its practical importance. Basic ranges are the most efficient and convenient way of representation of a set of similar data, so it must be used whenever possible. The main advantages of basic ranges are: best loops, best sequential and random object access, best cache locality.
The best loop in the programming is the loop n-times. This is because such loops can be unrolled, pipelined and optimized for small loop count. So compilers usually try to reorganize every loop into this kind of loop, if possible.
Range...() is four basic functions to work with ranges.
template <class T,class S> void RangeCopy(T *restrict dst,const S *restrict src,ulen count); template <class T> void RangeSet(T *dst,T src,ulen count); template <class T> void RangeSet_null(T *dst,ulen count); template <class T,class S> ulen RangeMatch(const T *dst,const S *src,ulen count);
RangeCopy() does copy one range to disjoint another range with the possible implicit type conversion. It performs assignments *dst=*src;. Here we use the non-standard extension to C++ — keyword restrict. It tells compiler, that dst and src ranges has no common elements, and enables loop optimization, based on this assumption. restrict, in fact, is a macro in the target PlatformBase.h header, so target can define this word according compiler abilities.
RangeSet() sets elements of the given range to the given value. The src argument is not a reference. It performs assignments *dst=src;.
RangeSet_null() sets elements of the given range to the null value T(). It performs assignments *dst=T();.
The interface and implementation of these functions are selected to be more appropriate, when the type T is a Small Data Type. They do they job in the ascending order, but you should not assume any specific order of execution.
RangeMatch() compares elements of two ranges. It returns the number of the first equals pairs. In other words, if the RangeMatch() returned len, then dst[0]==src[0], ..., dst[len-1]==src[len-1], but len==count or dst[len]!=src[len]. To compare elements the operator != is used.
PtrLen is a range representation type. It contains the range base pointer and the range length.
template <class T>
struct PtrLen
{
T *ptr;
ulen len;
PtrLen is widely used in CCore to pass ranges as arguments and to do native operations with ranges. This type has a bunch of methods to do so.
You can use constructors to create a PtrLen, but more often you should use the function Range().
// constructors PtrLen() noexcept : ptr(0),len(0) {} PtrLen(NothingType) : ptr(0),len(0) {} PtrLen(T *ptr_,ulen len_) : ptr(ptr_),len(len_) {} operator PtrLen<const T>() const { return {ptr,len}; }
You can also cast to the const version.
Cursor methods can be used to iterated through range elements. When you use PtrLen as a cursor, it advances in the ascending order. I.e. after (ptr,len) the (ptr+1,len-1) follows.
// cursor ulen operator + () const { return len; } // get length // range is not empty bool operator ! () const { return !len; } // range is empty T & operator * () const { return *ptr; } // first element T * operator -> () const { return ptr; } // pointer to the first element void operator ++ () { ptr++; len--; } // advance void operator -- () { ptr--; len++; } // back PtrLen<T> operator += (ulen delta); // advance on delta PtrLen<T> takeup(ulen delta); // advance up to delta elements T & take(); // take first element and advance PtrLen<T> getFinal() const { return {ptr+len,0}; }
Here is a typical usage:
for(auto r=Range(....); +r ;++r) r->method();
for(auto r=Range(....); +r ;++r) func(*r);
for(auto r=Range(....); +r ;) func(r.take());
for(auto r=Range(....); r.fit(10) ;) { auto prefix=(r+=10); // first 10 element subrange .... }
for(auto r=Range(....);;) { auto prefix=r.takeup(10); // first <=10 element subrange if( !prefix ) break; // r was empty .... }
operator += advances on the given number of elements and returns the range of these elements. You must be sure the number does not exceed the range length.
takeup() is similar, but it checks the range length and advances up to the given number of elements.
Loop
for(auto r=Range(....); +r ;++r) { .... /* (*r) */ }
,in fact, is the same as
{ T *ptr=....; ulen len=....; for(; len ;ptr++,len--) { .... /* (*ptr) */ } }
So it is count-down n-times loop. There is another popular way of a range iteration:
T *ptr=....; T *lim=....; for(; ptr<lim ;ptr++) { .... /* (*ptr) */ }
Compilers, however, often turn this loop into the previous one. It leads to:
T *ptr=....; T *lim=....; ulen len=lim-ptr; // there is a hidden division operation here for(; len ;ptr++,len--) { .... /* (*ptr) */ }
Many embedded CPU has no hardware division. So it leads to an expensive hidden operation, that may kill performance for short tight loops.
Reverse iteration can be done with the method getFinal():
auto r=Range(....); // (ptr,len) for(auto fin=r.getFinal(); fin.len<r.len ;) { --fin; func(fin); // iterate (ptr+len-1,1), (ptr+len-2,2), ..., (ptr,len) }
Methods fit() can be used to check the length of the range.
// fit bool fit(ulen length) const { return length<=len; } bool fit(ulen off,ulen length) const { return off<=len && length<=(len-off) ; }
The first checks is there at least the length elements. The second can be used to check if the subrange with the given off(set) from the begin and length can be selected.
You can select a desired part of the range using parts methods:
// parts PtrLen<T> prefix(ulen length) const; // prefix of the given length PtrLen<T> prefix(PtrLen<T> suffix) const; // prefix, complementary to the given suffix PtrLen<T> suffix(ulen length) const; // suffix of the given length PtrLen<T> part(ulen off,ulen length) const; // inner part with the given off(set) and length PtrLen<T> part(ulen off) const; // suffix with the given off(set) PtrLen<T> inner(ulen off,ulen endoff) const; // inner part with the given off(set) and offset from the end (endoff)
You can pick an element of the range based on its index:
// index access T & operator [] (ulen index) const; T & at(ulen index) const; // with the index guard T & back(ulen index) const; // reverse order, index 1 means the last element
The following set of methods performs basic range operations:
// methods template <class S> void copyTo(S dst[/* len */]) const; template <class S> void copyFrom(const S src[/* len */]) const; void copy(const T src[/* len */]) const; void set(T src) const; void set_null() const; template <class S> ulen match(const S src[/* len */]) const; template <class S> bool equal(const S src[/* len */]) const; template <class S> bool equal(PtrLen<S> src) const; template <class S> bool hasPrefix(PtrLen<S> src) const; template <class S> bool hasSuffix(PtrLen<S> src) const;
The last method is required to support range-based for with the PtrLen:
// begin()/end() support bool operator != (PtrLen<T> end) const { return len!=end.len; }
StrLen is a derived class from the PtrLen<const char>.
struct StrLen : PtrLen<const char>
{
// constructors
StrLen() noexcept {}
StrLen(NothingType) {}
// implicit conversions
StrLen(const char *zstr); // zstr!=0 is a pointer to a zero-terminated string
StrLen(PtrLen<const char> str) : PtrLen<const char>(str) {}
StrLen(PtrLen<char> str) : PtrLen<const char>(str.ptr,str.len) {}
// given string
StrLen(const char *str,ulen len) : PtrLen<const char>(str,len) {}
};
StrLen is widely used in CCore to pass string arguments and to work with strings. It is not assumed, that the string is zero-terminated. Instead, we specify the string length.
StrLen has a several implicit constructors, so it can be silently converted from different string-like types.
The implicit encoding string length with zero termination was a language design mistake. To perform string operations you have to know its length. So each time you will have to calculate it. It may lead to quadratic time algorithms instead of linear time if you are not steady enough. But the most goodless thing is you cannot select a substring from a zero-terminated string: it is not a zero-terminated! So you have to allocate memory and do copy! In CCore we avoid it converting zero-terminated strings into StrLen. So we calculate zero-terminated string length only one time and then use it everywhere.
Range() and Range_const() is a family of Creator functions for the range creation.
template <class T> PtrLen<T> Range(PtrLen<T> a); template <class T,ulen Len> PtrLen<T> Range(T (&buf)[Len]); template <class T> PtrLen<T> Range(T *ptr,ulen len); template <class T> PtrLen<T> Range(T *ptr,T *lim);
And the correspondent const-variants.
template <class T> PtrLen<const T> Range_const(PtrLen<T> a); template <class T,ulen Len> PtrLen<const T> Range_const(T (&buf)[Len]); template <class T> PtrLen<const T> Range_const(T *ptr,ulen len); template <class T> PtrLen<const T> Range_const(T *ptr,T *lim);
You can create a range from a PtrLen, from an array, from a base pointer and length and from a base pointer and a limit pointer. Limit is a pointer to the past-the-end element. It may not refer to an actual object. The advantage of using the function Range() is an implicit type inference. Consider:
int buf[10]; auto r=Range(buf); PtrLen<int> r1(buf,10);
Finally, there exists two generic functions Range() and Range_const():
template <class S> auto Range(S &&src); template <class S> auto Range_const(const S &src);
They are applicable to objects of types with the Range Access Interface:
class C { public: // range access T * getPtr(); // called from Range() const T * getPtr() const; // called from Range() with constant argument const T * getPtr_const() const; // called from Range_const() ulen getLen() const; };
CCore defines in its namespace global functions begin() and end(). They are applicable to objects of types with the Range Access Interface and to PtrLen type. So you can use range-based for:
PtrLen<T> r=....; for(auto obj : r ) { .... }
This loop is essentially the same as:
PtrLen<T> r=....; for(; +r ;++r) { .... } // *r instead of obj
Or:
DynArray<T> data; for(auto obj : data ) { .... }
Instead of:
DynArray<T> data; for(auto r=Range(data); +r ;++r) { .... } // *r instead of obj
Sometimes you need to do something more complicated than merely run through a range. In such cases use range loops instead of range-based for:
DynArray<T> data; auto r=Range(data); if( +r ) { .... // handle the first element for(++r; +r ;++r) { .... // do with the rest } } else { .... // empty case }
Mutate...() is a three functions family to cast a pointer or a range to the byte pointer or the byte range.
template <class T> T * MutatePtr(void *ptr); template <class T> T * MutatePtr(const void *ptr); template <class T,class S> PtrLen<T> Mutate(PtrLen<S> a);
The destined type T must be (const) char or unsigned char. Mutate() creates a PtrLen over all bytes, occupied by elements of the source range. C++ standard allows aliasing of objects with these types. Mutate...() should be used with caution in special cases for the sake of performance or to do some low-level processing.