libnuwen-2.0.0.1.tar.bz2 (59 KB) : The complete header-only library, including test cases, Makefile, and ChangeLog.
libnuwen is distributed under the Boost Software License, Version 1.0.
libnuwen is my personal library of modern C++. Among other things, it contains thoroughly tested code for data compression, container manipulation, file I/O, hashing, networking, serialization, timing, date/time formatting, CGI, and console text coloring.
In my personal programming, I frequently need the same things again and again: fixed-length types, sequence concatenation, etc. I avoid writing the same code repeatedly by implementing in libnuwen anything that can be encapsulated well. This also allows me to subject my code to extensive testing. While I write libnuwen for myself, it is distributed under the Boost Software License in the hopes that others may find it to be useful.
If you find libnuwen to be useful, or if you encounter problems with it, please tell me at stl@nuwen.net. I welcome bug reports and enhancement requests.
libnuwen version numbers consist of four parts: major.minor.gizmo.patch.
libnuwen is always stable: there is no development branch, and each new version supersedes all earlier versions.
libnuwen is pronounced "lihb-noo-when". The first syllable is pronounced like "liberty" and not like "library". libnuwen is always written in lowercase.
libnuwen is developed with my MinGW Distro and MSVC 8.0 SP1 on Windows, and tested with GCC 4.1.1 on GNU/Linux. It compiles cleanly at the highest warning levels, excluding useless warnings. It attempts to be completely system-agnostic, aside from assuming 8-bit bytes and two's complement signed numbers. (Headers which do inherently system-dependent things achieve this through conditional compilation.) libnuwen is fully endian-agnostic, and should be reasonably 64-bit clean. (I have never compiled libnuwen on a 64-bit system, so some minor issues may exist.)
When libnuwen encodes unsigned integers into bytes, and decodes bytes into unsigned integers, it always uses big-endian representation. This is done independently of the machine's representation.
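As an illustration of how an encoding can be big-endian regardless of the host, here is a minimal sketch in standard C++ (assuming a C++11 compiler for <cstdint>). The helper names put_be32 and get_be32 are hypothetical and are not part of libnuwen; the point is only that shifting values, rather than copying memory, gives the same bytes on every machine.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: append a 32-bit unsigned integer as four big-endian bytes.
// Shifts operate on values, not on memory, so the host's byte order is irrelevant.
inline void put_be32(std::vector<std::uint8_t>& out, std::uint32_t n) {
    out.push_back(static_cast<std::uint8_t>((n >> 24) & 0xFF));
    out.push_back(static_cast<std::uint8_t>((n >> 16) & 0xFF));
    out.push_back(static_cast<std::uint8_t>((n >>  8) & 0xFF));
    out.push_back(static_cast<std::uint8_t>( n        & 0xFF));
}

// Hypothetical helper: read four big-endian bytes starting at offset pos.
// The caller must ensure that pos + 4 <= in.size().
inline std::uint32_t get_be32(const std::vector<std::uint8_t>& in, std::size_t pos) {
    return (static_cast<std::uint32_t>(in[pos    ]) << 24)
         | (static_cast<std::uint32_t>(in[pos + 1]) << 16)
         | (static_cast<std::uint32_t>(in[pos + 2]) <<  8)
         |  static_cast<std::uint32_t>(in[pos + 3]);
}
```

Encoding 0x12345678 this way yields the bytes 0x12, 0x34, 0x56, 0x78 in that order on both little-endian and big-endian hosts.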
All libnuwen headers require a modern C++ compiler and a recent version of Boost. Specific headers, which are not used by other libnuwen headers, require other libraries: bzip2.hh requires libbzip2, jpeg.hh requires libjpeg, and zlib.hh requires zlib.
My MinGW Distro satisfies all of these requirements. If you use MSVC 8.0 SP1, I have already built the required headers and libraries. (Because I know exactly what I'm doing, I place the headers into C:\Program Files\Microsoft Visual Studio 8\VC\PlatformSDK\Include and the libraries into C:\Program Files\Microsoft Visual Studio 8\VC\PlatformSDK\Lib. You may want to place them elsewhere, which would involve appropriately configuring your header and library search paths.) If you use GCC on GNU/Linux, you will have to build all of the requirements yourself.
libnuwen is a header-only library requiring no configuration. To use a part of libnuwen, simply include the relevant header. Some libnuwen headers depend on separately compiled libraries that you will have to link against. Forgetting this will lead to "obvious" linker errors.
Public components (i.e., for use by client code) live in namespace nuwen, and the few public macros begin with NUWEN_. Internal components (i.e., invisible to client code) live in namespace pham, and internal macros begin with PHAM_.
Naturally, libnuwen headers are idempotent (multiple inclusion is not harmful), order-agnostic (each header includes everything that it needs from other libnuwen headers, other libraries such as Boost, and the Standard Library), and namespace-clean (libnuwen headers will not pollute the global namespace). Furthermore, client code may say using namespace nuwen; without implying using namespace boost; or using namespace std;.
Internally, each libnuwen header is organized to present a readable interface. Function declarations, class definitions, constants, and so forth appear in a namespace nuwen block as high as possible. This block (excluding the private sections of classes) is intended to be read by clients. The libnuwen implementation consists of everything below this block.
*.hh - Public headers; each header is a libnuwen module.
*_private.hh - Internal headers.
*.cc - Test cases. They also serve as usage examples, although that is not their primary purpose.
changelog.txt - ChangeLog history since version 1.0.0.0.
LICENSE_1_0.txt - The Boost Software License, Version 1.0.
Makefile - GNU make Makefile for MinGW GCC, GNU/Linux GCC, and MSVC.

The Makefile builds the test cases, which are used to ensure that the headers are functioning properly. You can use the headers without building the test cases, but if you are interested in building them, this is how it works:

make builds all test cases.
make clean deletes all test case executables as well as other garbage that may be left behind by interrupted builds.
make poison scans libnuwen for certain constructs that I want to avoid. You don't need to run this.
make huff will build huff_test.exe, and similarly for every other test case.

The Makefile contains magic machinery that allows it to handle new header/test case pairs without being modified. However, the Makefile does have to contain special rules for those test cases that are linked against separately compiled libraries.
The Makefile detects your operating system (Windows or GNU/Linux) automatically. To build test cases on Windows with MSVC instead of GCC, replace make with make MSVC=y. Also, make MSVC=y DEBUG=y will build test cases in debug mode. This subjects them to powerful iterator debugging checks, as well as allowing the debugger to step through the executables. (No debug mode is provided for GCC.) These are variables, not targets, so make MSVC=y huff and make MSVC=y DEBUG=y huff will build a single test case with MSVC in release or debug mode.
Maximum warnings are always used when building test cases. Maximum optimizations are always used, except in MSVC debug mode, which uses no optimizations.
How can I use libnuwen sockets to communicate with an HTTP server?
You can't.
libnuwen sockets currently abstract a model in which both the client and the server are powered by libnuwen. They do not attempt to be a general abstraction of networking in modern C++. Instead, they simply wrap a handful of the functions provided by BSD sockets (and even this takes a significant amount of work).
typedef.hh is the most important libnuwen header. It provides the fixed-length integer typedefs and the sequence container typedefs which libnuwen uses everywhere else.
Note: In the interest of clarity, I will present each header's public components in a namespace nuwen block, taken nearly verbatim (eliding the private sections of classes) from the header itself. In the interest of brevity, unqualified names in prose documentation should be assumed to come from namespace nuwen.
namespace nuwen {
// Fixed-length integer typedefs:
typedef boost::uint8_t uc_t;
typedef boost::int8_t sc_t;
typedef boost::uint16_t us_t;
typedef boost::int16_t ss_t;
typedef boost::uint32_t ul_t;
typedef boost::int32_t sl_t;
typedef boost::uint64_t ull_t;
typedef boost::int64_t sll_t;
// 231 sequence container typedefs of the form:
// [vdl](uc|sc|us|ss|ul|sl|ull|sll|b|c|s)(|_i|_ci|_d|_s|_ri|_cri)_t
// Two examples:
// typedef std::vector<uc_t> vuc_t;
// typedef std::deque<std::string>::const_iterator ds_ci_t;
}
The _t suffix means "typedef"; it is an ancient tradition followed by std::size_t.
The fixed-length integer typedefs begin with u for unsigned and s for signed. Following that, they are named after char, short, long, and long long, which are respectively guaranteed to be at least 8, 16, 32, and 64 bits.
The sequence container typedefs are systematically stamped out with macros. The most basic typedef abbreviates std::vector<uc_t> to vuc_t. This is the system:

v for std::vector.
d for std::deque.
l for std::list.
b for bool.
c for char.
s for std::string.
_t for the sequence container itself.
_i_t for iterator.
_ci_t for const_iterator.
_d_t for difference_type.
_s_t for size_type.
_ri_t for reverse_iterator.
_cri_t for const_reverse_iterator.

This terse system may seem unusual, but consider whether std::vector<boost::uint32_t>::const_iterator is more readable than nuwen::vul_ci_t.
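To make the "stamped out with macros" idea concrete, here is a rough sketch of how one macro could generate the seven typedefs for a single container/element pair. The macro name SKETCH_SEQ_TYPEDEFS and the namespace sketch are hypothetical, and unsigned char stands in for boost::uint8_t so the example needs only the Standard Library; libnuwen's actual internal macros may look quite different.

```cpp
#include <deque>
#include <list>
#include <string>
#include <vector>

namespace sketch {

typedef unsigned char uc_t; // stand-in for boost::uint8_t

// Hypothetical macro: stamps out the seven typedefs for one container/element pair.
#define SKETCH_SEQ_TYPEDEFS(CONTAINER, ELEM, NAME)                 \
    typedef CONTAINER<ELEM>                         NAME##_t;      \
    typedef CONTAINER<ELEM>::iterator               NAME##_i_t;    \
    typedef CONTAINER<ELEM>::const_iterator         NAME##_ci_t;   \
    typedef CONTAINER<ELEM>::difference_type        NAME##_d_t;    \
    typedef CONTAINER<ELEM>::size_type              NAME##_s_t;    \
    typedef CONTAINER<ELEM>::reverse_iterator       NAME##_ri_t;   \
    typedef CONTAINER<ELEM>::const_reverse_iterator NAME##_cri_t;

// Three of the many combinations, following the naming system described above.
SKETCH_SEQ_TYPEDEFS(std::vector, uc_t,        vuc)
SKETCH_SEQ_TYPEDEFS(std::deque,  std::string, ds)
SKETCH_SEQ_TYPEDEFS(std::list,   bool,        lb)

#undef SKETCH_SEQ_TYPEDEFS

} // namespace sketch
```

With these in place, sketch::vuc_ci_t names std::vector<unsigned char>::const_iterator and sketch::ds_s_t names std::deque<std::string>::size_type.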
Don't confuse this system (of using terse typedefs as abbreviations for lengthy type names) with Hungarian notation, the evil practice of encoding type information into the names of variables. Hungarian notation is evil for many reasons, one of which is that putting type information (whether highly abbreviated or not) into variable names provides no actual benefit while degrading the readability of those variable names. Abbreviating types themselves can enhance readability, as long as a consistent and simple system is used, and various other evils are avoided; for example, the practices of forming constant and pointer typedefs. Readability is severely harmed by embedding const and * into typedefs, with the special exceptions of constant iterators, pointers to functions, pointers to member functions, and pointers to member variables.
vuc_t is the most frequently used sequence container typedef in libnuwen interfaces. Byte sequences are special. They represent arbitrary binary data ("a chunk of memory"), and they can be compressed and decompressed, stored in files, and sent over the network. Other objects must be serialized to (and deserialized from) byte sequences before they can undergo such operations.
algorithm.hh provides one algorithm which is missing from the STL, as well as STL algorithm wrappers that take sequence containers or pointers to member functions.
namespace nuwen {
template <typename InIt, typename OutIt, typename Pred> OutIt copy_if(InIt first, InIt last, OutIt result, Pred pred);
template <typename C> C& universal_sort(C& container);
template <typename T> std::list<T>& universal_sort(std::list<T>& l);
template <typename C, typename F> C& universal_sort(C& container, F comparator);
template <typename T, typename F> std::list<T>& universal_sort(std::list<T>& l, F comparator);
template <typename C> C universal_sort_copy(const C& container);
template <typename C, typename F> C universal_sort_copy(const C& container, F comparator);
template <typename C> C& universal_stable_sort(C& container);
template <typename T> std::list<T>& universal_stable_sort(std::list<T>& l);
template <typename C, typename F> C& universal_stable_sort(C& container, F comparator);
template <typename T, typename F> std::list<T>& universal_stable_sort(std::list<T>& l, F comparator);
template <typename C> C universal_stable_sort_copy(const C& container);
template <typename C, typename F> C universal_stable_sort_copy(const C& container, F comparator);
template <typename C, typename T> C& nuke ( C& container, const T& value);
template <typename C, typename T> C nuke_copy(const C& container, const T& value);
template <typename C, typename P> C& nuke_if ( C& container, P pred);
template <typename C, typename P> C nuke_copy_if(const C& container, P pred);
template <typename C> C& nuke_dupes(C& container);
template <typename C, typename F, typename B> C& nuke_dupes(C& container, F comparator, B binary_pred);
template <typename C> C nuke_dupes_copy(const C& container);
template <typename C, typename F, typename B> C nuke_dupes_copy(const C& container, F comparator, B binary_pred);
template <typename C> C& stable_nuke_dupes(C& container);
template <typename C, typename F, typename B> C& stable_nuke_dupes(C& container, F comparator, B binary_pred);
template <typename C> C stable_nuke_dupes_copy(const C& container);
template <typename C, typename F, typename B> C stable_nuke_dupes_copy(const C& container, F comparator, B binary_pred);
template <typename InIt, typename OutIt, typename MemR, typename MemT>
OutIt copy_if(InIt first, InIt last, OutIt result, MemR (MemT::* pred)() const);
template <typename C, typename MemR, typename MemT>
C& nuke_if(C& container, MemR (MemT::* pred)() const);
template <typename C, typename MemR, typename MemT>
C nuke_copy_if(const C& container, MemR (MemT::* pred)() const);
template <typename InIt, typename MemR, typename MemT>
void for_each(InIt first, InIt last, MemR (MemT::* fxn)());
template <typename InIt, typename MemR, typename MemT>
InIt find_if(InIt first, InIt last, MemR (MemT::* pred)() const);
template <typename InIt, typename MemR, typename MemT>
typename std::iterator_traits<InIt>::difference_type
count_if(InIt first, InIt last, MemR (MemT::* pred)() const);
template <typename InIt, typename OutIt, typename MemR, typename MemT>
OutIt transform(InIt first, InIt last, OutIt result, MemR (MemT::* op)() const);
template <typename FwdIt, typename MemR, typename MemT, typename T>
void replace_if(FwdIt first, FwdIt last, MemR (MemT::* pred)() const, const T& new_value);
template <typename It, typename OutIt, typename MemR, typename MemT, typename T>
OutIt replace_copy_if(It first, It last, OutIt result, MemR (MemT::* pred)() const, const T& new_value);
template <typename BiIt, typename MemR, typename MemT>
BiIt partition(BiIt first, BiIt last, MemR (MemT::* pred)() const);
template <typename BiIt, typename MemR, typename MemT>
BiIt stable_partition(BiIt first, BiIt last, MemR (MemT::* pred)() const);
}
copy_if() should be in the Standard Library but isn't.
universal_sort() works on all Standard sequence containers. All combinations of comparator, copying, and stable versions are provided.
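The reason a "universal" sort needs more than one overload is that std::sort requires random-access iterators, which std::list does not provide. A minimal sketch of the dispatch, mirroring the two non-comparator signatures shown above (the name universal_sort_sketch is hypothetical, and this is not the library's actual implementation):

```cpp
#include <algorithm>
#include <list>
#include <vector>

// Generic version: suitable for random-access containers (std::vector, std::deque).
template <typename C> C& universal_sort_sketch(C& container) {
    std::sort(container.begin(), container.end());
    return container;
}

// More specialized overload: std::list has only bidirectional iterators,
// so it must be sorted with its own member function.
template <typename T> std::list<T>& universal_sort_sketch(std::list<T>& l) {
    l.sort();
    return l;
}
```

When the argument is a std::list, partial ordering of function templates selects the second overload; every other container goes through std::sort.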
nuke() is a sequence container-based encapsulation of the erase-remove idiom. All combinations of copying and predicate versions are provided. (When using the erase-remove idiom, it is easy to forget to provide the second iterator to erase. There is a form of erase that takes a single iterator, so such a mistake will not be caught at compile time. These wrappers make that mistake impossible to commit.)
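For readers unfamiliar with the erase-remove idiom, this is roughly what such a wrapper looks like. It is a sketch in standard C++ under the hypothetical name nuke_sketch, not libnuwen's code; in particular, it assumes the container has an erase taking an iterator pair, and that value does not refer to an element of the container being modified.

```cpp
#include <algorithm>
#include <vector>

// Erase-remove in one call: std::remove shifts the kept elements forward and
// returns the new logical end; erase then trims the leftover tail.
// Passing container.end() as the second argument is the step that is easy to forget.
template <typename C, typename T> C& nuke_sketch(C& container, const T& value) {
    container.erase(std::remove(container.begin(), container.end(), value),
                    container.end());
    return container;
}
```

Applied to a vector holding 1, 2, 1, 3 with value 1, the vector is left holding 2, 3.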
In the interest of generality, nuke_dupes() is provided, which encapsulates the sort-unique-erase idiom. All combinations of comparator/binary predicate, copying, and stable versions are provided.
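The sort-unique-erase idiom can be sketched the same way. The name nuke_dupes_sketch is hypothetical, and this version covers only the simplest case (default comparison on a random-access container); the stable and comparator/binary predicate variants are not shown.

```cpp
#include <algorithm>
#include <vector>

// Sort so that equal elements become adjacent, collapse each run of equal
// elements with std::unique, then erase the leftover tail.
template <typename C> C& nuke_dupes_sketch(C& container) {
    std::sort(container.begin(), container.end());
    container.erase(std::unique(container.begin(), container.end()),
                    container.end());
    return container;
}
```

Note that this reorders the container: 3, 1, 3, 2, 1 becomes 1, 2, 3.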
The wrappers which take pointers to member functions convert them into functors with boost::mem_fn.
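A wrapper of this kind can be sketched as follows. To keep the example free of external dependencies, it uses std::mem_fn (C++11) as a stand-in for boost::mem_fn; the name count_if_sketch and the task struct are hypothetical and exist only for illustration.

```cpp
#include <algorithm>
#include <functional>
#include <iterator>
#include <vector>

// Accepts a pointer to a const, zero-argument member function and adapts it
// into a functor that std::count_if can call on each element.
template <typename InIt, typename MemR, typename MemT>
typename std::iterator_traits<InIt>::difference_type
count_if_sketch(InIt first, InIt last, MemR (MemT::* pred)() const) {
    return std::count_if(first, last, std::mem_fn(pred));
}

// Hypothetical element type used to exercise the wrapper.
struct task {
    explicit task(bool d) : done_(d) {}
    bool done() const { return done_; }
    bool done_;
};
```

Client code can then write count_if_sketch(v.begin(), v.end(), &task::done) without spelling out the adapter at each call site.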
arith.hh provides functions for arithmetic encoding and decoding with an adaptive model.
namespace nuwen {
inline vuc_t arith(const vuc_t& v);
inline vuc_t unarith(const vuc_t& v);
}
Although adaptive arithmetic coding can be used by itself for compression, it is not very desirable to do so. The coder is neither blazingly fast nor blazingly efficient. Applications which desire extreme speed should use Huffman coding, while applications which desire extreme efficiency should use the full BWT/MTF-2/ZLE/Arith pipeline.
bwt.hh provides functions for the Burrows-Wheeler Transformation and its inverse.
namespace nuwen {
inline vuc_t bwt(const vuc_t& v);
inline vuc_t unbwt(const vuc_t& v);
}
bwt() is implemented using Ukkonen's linear time algorithm for suffix tree construction.
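For readers who want to see what the transformation computes, here is a deliberately naive sketch that sorts all rotations of the input and takes the last column. It is not the suffix tree approach mentioned above and is far slower; the function name naive_bwt, the use of std::string, and the convention of returning a (text, primary index) pair are all assumptions made for this example, and say nothing about the byte layout that bwt() actually produces. A C++11 compiler is assumed.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// Naive forward BWT: sort every rotation of s and read off the last column.
// Returns the transformed text and the sorted row holding the original string.
inline std::pair<std::string, std::size_t> naive_bwt(const std::string& s) {
    const std::size_t n = s.size();
    std::vector<std::size_t> rot(n);
    std::iota(rot.begin(), rot.end(), std::size_t(0)); // rot[i] = start of rotation i

    std::sort(rot.begin(), rot.end(), [&](std::size_t a, std::size_t b) {
        for (std::size_t k = 0; k < n; ++k) {
            const unsigned char x = s[(a + k) % n];
            const unsigned char y = s[(b + k) % n];
            if (x != y) return x < y;
        }
        return false; // identical rotations compare equal
    });

    std::string last(n, '\0');
    std::size_t primary = 0;
    for (std::size_t i = 0; i < n; ++i) {
        last[i] = s[(rot[i] + n - 1) % n]; // character preceding the rotation's start
        if (rot[i] == 0) primary = i;
    }
    return std::make_pair(last, primary);
}
```

For the input "banana", this yields "nnbaaa" with primary index 3; the clustering of identical characters is what makes the later stages effective.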
bzip2.hh provides wrappers around libbzip2.
namespace nuwen {
inline vuc_t bzip2(const vuc_t& v);
inline vuc_t unbzip2(const vuc_t& v);
}
bzip2() uses maximum compression. The format that bzip2() generates and that unbzip2() consumes is identical to the format used by .bz2 files.
cgi.hh provides code for Common Gateway Interface programs.
namespace nuwen {
inline std::string x_www_form_urlencoded_from_str(const std::string& s);
inline std::string str_from_x_www_form_urlencoded(const std::string& s);
inline std::string get_env_var(const char * name);
inline std::string get_content();
namespace cgi {
class request {
public:
inline explicit request(const std::string& s);
inline ser::serial& serialize(ser::serial& s) const;
inline explicit request(des::deserial& d);
inline std::string operator[](const std::string& field) const;
inline vs_t all_values_for(const std::string& field) const;
};
}
inline std::map<std::string, std::string> read_cookie(const std::string& s);
// If m is empty, a std::logic_error will be thrown.
// If a string longer than 4096 characters is generated, a std::runtime_error will be thrown.
// If not persistent, it is a session cookie. The expiration date and domain are ignored.
// If persistent, an expiration date is necessary. A domain such as ".nuwen.net" is optional.
// By default, when the domain is "", cookies are sent back to only the originating server.
inline std::string make_cookie(const std::map<std::string, std::string>& m, bool persistent = false,
sll_t absolute_expiration_date = current_qh_time() + 3600, const std::string& domain = "");
}
x_www_form_urlencoded_from_str() and str_from_x_www_form_urlencoded() perform application/x-www-form-urlencoded encoding and decoding as specified in RFC 1866.
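The encoding direction can be sketched in a few lines. This version assumes the strictest reading of the rule (only ASCII letters and digits pass through unchanged, spaces become '+', and every other byte becomes %HH with uppercase hex digits) and assumes the default "C" locale for std::isalnum. The name form_urlencode_sketch is hypothetical, and the library's own function may differ in which characters it leaves unescaped.

```cpp
#include <cctype>
#include <string>

// Sketch of application/x-www-form-urlencoded encoding.
inline std::string form_urlencode_sketch(const std::string& s) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        if (std::isalnum(c)) {
            out += static_cast<char>(c);  // letters and digits are kept as-is
        } else if (c == ' ') {
            out += '+';                   // space has a dedicated short form
        } else {
            out += '%';                   // everything else is percent-escaped
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}
```

For example, the string "a b&c=d" encodes to "a+b%26c%3Dd".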
get_env_var() returns an environment variable's value as a std::string. It returns an empty string if the environment variable does not exist. get_content() returns the data passed to the CGI program, whether it was sent via GET or POST. Although POST data comes through the standard input, get_content() uses static variables so it can be called more than once.
The string returned by get_content() can be used to construct a nuwen::cgi::request, which will parse the string into fields and values. (It can also be serialized.) Its operator[]() returns the value of a given field. If there are multiple values, it returns one of them, while if there are no values, it returns an empty string. The more powerful but less convenient all_values_for() allows you to handle multiple and nonexistent values.
read_cookie() parses a cookie into a map between names and values.
make_cookie() returns a string, the cookie, which is to be emitted with the headers of the page that the CGI program returns. You must provide a map containing the data that will be stored in the cookie.
clock.hh provides code for high-precision timing, as well as sleeping with no CPU consumption.
namespace nuwen {
inline ull_t clock_freq();
inline ull_t clock_ctr();
inline void snooze(double s);
namespace chrono {
class watch {
public:
inline watch();
inline void reset();
inline double seconds() const;
};
}
}
clock_freq() returns how many ticks of the high-precision clock pass during a single second. clock_ctr() returns the current high-precision clock counter. Generally, this achieves microsecond precision.
These functions are further wrapped by nuwen::chrono::watch; when one is constructed, it records the high-precision clock counter. You can later use seconds() to determine how much time has passed since the object's construction. You can also reset() the stored counter.
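The stopwatch pattern described here is easy to sketch with the Standard Library. The class below uses std::chrono::steady_clock (C++11) as a stand-in for the platform's high-precision counter; the name watch_sketch is hypothetical and the actual nuwen::chrono::watch is presumably built on clock_ctr() and clock_freq() instead.

```cpp
#include <chrono>

// Records a time point on construction and reports the elapsed time on demand.
class watch_sketch {
public:
    watch_sketch() : start_(std::chrono::steady_clock::now()) {}

    // Restart the measurement from the current instant.
    void reset() { start_ = std::chrono::steady_clock::now(); }

    // Seconds elapsed since construction or the most recent reset().
    double seconds() const {
        const std::chrono::duration<double> elapsed =
            std::chrono::steady_clock::now() - start_;
        return elapsed.count();
    }

private:
    std::chrono::steady_clock::time_point start_;
};
```

A monotonic clock is used so that adjustments to the system time cannot make the reported interval negative.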
snooze() makes the program sleep for a given length of time, measured in seconds, without spinning the processor.
color.hh provides functions for console text coloring.
namespace nuwen {
inline void set_fore_color(bool bold, bool red, bool green, bool blue);
inline void set_back_color(bool bold, bool red, bool green, bool blue);
inline void reset_color();
// Manipulators, like std::endl.
// black bg_black
// darkblue bg_darkblue
// darkgreen bg_darkgreen
// darkcyan bg_darkcyan
// darkred bg_darkred
// darkpurple bg_darkpurple
// darkyellow bg_darkyellow
// gray bg_gray
// darkgray bg_darkgray
// blue bg_blue
// green bg_green
// cyan bg_cyan
// red bg_red
// purple bg_purple
// yellow bg_yellow
// white bg_white
}
reset_color() sets the foreground color to gray and the background color to black.
compiler.hh provides macros that identify the version of libnuwen being used, as well as the platform and compiler being used.
// Platform: These macros are nicer names for other macros.
#define NUWEN_PLATFORM_GCC // Defined if __GNUC__ is defined.
#define NUWEN_PLATFORM_MINGW // Defined if __MINGW32__ is defined.
#define NUWEN_PLATFORM_MSVC // Defined if _MSC_VER is defined.
#define NUWEN_PLATFORM_WINDOWS // Defined if _WIN32 is defined.
#define NUWEN_PLATFORM_UNIX // Defined if _WIN32 is NOT defined.
// Libnuwen Version: These macros are decimal literals. For example:
#define NUWEN_MAJOR_VERSION 2
#define NUWEN_MINOR_VERSION 0
#define NUWEN_GIZMO_VERSION 0
#define NUWEN_PATCH_VERSION 0
// Libnuwen Version: This macro is a string literal. For example:
#define NUWEN_VERSION "2.0.0.0"
// Compiler Name: This macro is a string literal. It can be any one of these:
#define NUWEN_COMPILER_NAME "GCC"
#define NUWEN_COMPILER_NAME "MinGW GCC"
#define NUWEN_COMPILER_NAME "Microsoft Visual Studio"
// Compiler Version: This macro is a string literal. It can look like this:
#define NUWEN_COMPILER_VERSION "4.1.2" // For GCC, including MinGW.
#define NUWEN_COMPILER_VERSION "2005 (8.0)" // For MSVC.
// The Kitchen Sink: This macro is a string literal.
#define NUWEN_COMPILER \
"This program was compiled on " __DATE__ " at " __TIME__ "\n" \
"by " NUWEN_COMPILER_NAME " " NUWEN_COMPILER_VERSION " using libnuwen " NUWEN_VERSION "."
// For example:
// This program was compiled on Feb 21 2007 at 05:58:23
// by MinGW GCC 4.1.2 using libnuwen 2.0.0.0.
Generally, libnuwen headers include each other in unspecified ways (just like the Standard headers do). However, compiler.hh is guaranteed to be included by every public libnuwen header except external_begin.hh and external_end.hh.
daemon.hh provides daemonize(), which makes a program run as a daemon.
namespace nuwen {
inline void daemonize();
}
Note that on Windows, running as a daemon is accomplished in a very different manner. Look at the Makefile to see how daemon_test.exe is compiled with MinGW and MSVC.
external_begin.hh is special. When libnuwen headers include non-libnuwen headers ("external" headers, like the Standard or Boost headers), this header is used in the following manner:
#include "external_begin.hh"
#include <algorithm>
#include <utility>
#include <boost/lexical_cast.hpp>
#include <boost/shared_ptr.hpp>
#include "external_end.hh"
On Windows, this deals with several messy things about the Windows API headers. (To summarize, both WINVER and _WIN32_WINNT have to be defined high enough for libnuwen to access certain functionality. Also, NOMINMAX and WIN32_LEAN_AND_MEAN have to be defined so that the Windows API headers don't conflict with other headers.)
Furthermore, on MSVC, this temporarily disables certain warnings that are triggered by non-libnuwen headers.
As a user of libnuwen, you are not required to use this header. However, the macros that affect the Windows API headers must be defined appropriately, or your programs may fail to compile. You can define these macros project-wide, ensure that libnuwen headers always come first (so they can define the macros before the Windows API headers are included; order dependencies are evil, but this isn't my fault), or use this header.
external_end.hh forms a pair with external_begin.hh.
fibonacci.hh provides functions for Fibonacci encoding and decoding.
namespace nuwen {
template <typename C> vuc_t fibonacci_encode(const C& c);
template <typename T> std::vector<T> fibonacci_decode(const vuc_t& v);
}
Fibonacci coding compactly represents unsigned integers that are biased towards being small. However, zeros cannot be represented.
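To show why small values get short codes and why zero has no code, here is a sketch that produces the Fibonacci codeword of a single integer as a string of '0' and '1' characters. The function name fibonacci_code_sketch is hypothetical, and the string output is purely for readability; how fibonacci_encode() actually packs bits into a vuc_t is not specified here.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Returns the Fibonacci codeword of n, least significant Fibonacci term first.
inline std::string fibonacci_code_sketch(std::uint32_t n) {
    if (n == 0) throw std::invalid_argument("zero has no Fibonacci code");

    // Collect the Fibonacci numbers 1, 2, 3, 5, 8, ... that do not exceed n.
    // 64-bit arithmetic keeps the running sums from overflowing.
    std::vector<std::uint64_t> fib;
    std::uint64_t a = 1, b = 2;
    while (a <= n) {
        fib.push_back(a);
        const std::uint64_t next = a + b;
        a = b;
        b = next;
    }

    // Greedy decomposition, largest term first. The greedy choice never
    // selects two adjacent Fibonacci numbers, so "11" cannot appear here.
    std::string bits(fib.size(), '0');
    std::uint64_t rest = n;
    for (std::size_t i = fib.size(); i-- > 0; ) {
        if (fib[i] <= rest) {
            bits[i] = '1';
            rest -= fib[i];
        }
    }

    bits += '1'; // terminator: together with the final term it forms the only "11"
    return bits;
}
```

The values 1, 2, 3, and 4 encode to "11", "011", "0011", and "1011". Every codeword ends in "11", so a decoder can find the boundaries in a concatenated stream, and since every codeword must contain at least one Fibonacci term, zero cannot be encoded.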
file.hh provides code for reading from and writing to binary files.
namespace nuwen {
namespace file {
enum output_mode {
create,
overwrite,
append
};
inline vuc_t read_file(const std::string& filename);
inline void write_file(const vuc_t& v, const std::string& filename, output_mode mode = create);
inline void remove_file(const std::string& filename);
class input_file {
public:
inline explicit input_file(const std::string& filename);
inline vuc_t read_at_most(vuc_s_t n);
inline vuc_t read_rest();
inline void close();
};
class output_file {
public:
inline explicit output_file(const std::string& filename, output_mode mode = create);
inline void write(const vuc_t& v);
inline void close();
};
}
}
nuwen::file::input_file and nuwen::file::output_file are copyable RAII resource managers. They have shared semantics, so a copy of an object will still refer to the same file.
gluon.hh provides code for terse and efficient sequence concatenation.
namespace nuwen {
template <typename T, typename X> unspecified-gluon-of-T-type glu(const X& x);
template <typename T> unspecified-gluon-of-T-type cat(const std::vector<T>& x);
template <typename T> unspecified-gluon-of-T-type cat(const std::deque<T>& x);
template <typename T> unspecified-gluon-of-T-type cat(const std::list<T>& x);
template <typename T, std::size_t N> unspecified-gluon-of-T-type cat(const T (&x)[N]);
}
A gluon-of-T, so named because it "glues" things together into a sequence of T, is a lightweight temporary object. It lives in an internal namespace (specifically, namespace pham::qcd). The function templates glu<T>() (also short for "glue") and cat() (short for "concatenate") can be used to create a gluon-of-T. The former explicitly specifies T, while the latter (provided for convenience) deduces T. The gluon-of-T then has an operator()() which can be used to create another gluon-of-T that refers to the first gluon-of-T. In this manner, a singly linked list of temporary objects can be built up.
Finally, the function templates vec(), deq(), and lst() (which live in namespace pham::qcd and are found through argument-dependent lookup) can be used to create a sequence out of the things that have been glued together with these temporary objects. Or, they can be appended to another sequence with operator+=() (also found through argument-dependent lookup).
Individual elements as well as sequences of T and arrays of T can be glued together. When an object is given to glu<T>() or the operator()(), if it is a sequence of T or array of T, it will be recognized as such. Otherwise, it is assumed to be an individual element, which can be of type T or another type which can be used to construct a T. (Note how std::vector<std::string>::push_back() accepts both std::string and const char *. Gluons work the same way.)
glu<T>() explicitly specifies T, and so must be used when the first thing being glued is an element. (For technical reasons, the type T must be known when the first gluon is created, although the type of the sequence that will eventually be created or appended to does not have to be known until the end. T cannot be deduced from a single element; if you begin with "foobar", you might be creating a sequence of std::string or you might be creating a sequence of const char *.) When the first thing being glued is a sequence of T or an array of T, glu<T>() can also be used, but is unnecessarily verbose. cat() will deduce T for you.
Suppose x, y, and z are of type vuc_t. Then, vec(glu<uc_t>(x)(y)(z)) appends the three vectors into a single vuc_t and is equivalent to vec(cat(x)(y)(z)). vec(glu<vuc_t>(x)(y)(z)) creates a std::vector<vuc_t> with three elements.
Here are usage examples:
// 1. Glue elements together to create a vector.
const std::vector<int> a = vec(nuwen::glu<int>(1)(3)(5)(7));
// a contains: 1, 3, 5, 7.
// 2. Glue elements together to create a deque.
const std::deque<int> b = deq(nuwen::glu<int>(2)(4)(6)(8));
// b contains: 2, 4, 6, 8.
// 3. Glue elements together to create a list.
const std::list<int> c = lst(nuwen::glu<int>(1)(4)(9)(16));
// c contains: 1, 4, 9, 16.
// 4. Concatenate a vector, deque, list, and array, then append a single element.
const int array[] = { 10, 20, 30 };
const std::vector<int> d = vec(nuwen::cat(a)(b)(c)(array)(55));
// d contains: 1, 3, 5, 7, 2, 4, 6, 8, 1, 4, 9, 16, 10, 20, 30, 55.
// 5. Create a temporary vector and give it to a function of the form:
// void foobar(const std::vector<int>& v);
foobar(vec(nuwen::glu<int>(66)(77)(88)));
// 6. Append things to another sequence.
std::vector<int> e(3, 456);
e += nuwen::glu<int>(25)(a)(50);
// e contains: 456, 456, 456, 25, 1, 3, 5, 7, 50.
// 7. Glue elements of different types together to create a vector.
const std::string s("cute");
const std::string t("kittens");
const nuwen::vs_t f = vec(nuwen::glu<std::string>(s)("fluffy")(t));
// f contains: "cute", "fluffy", "kittens".
// 8. Concatenate several copies of a vector.
const std::vector<int> g = vec(nuwen::cat(a)(a)(a));
// g contains: 1, 3, 5, 7, 1, 3, 5, 7, 1, 3, 5, 7.
// 9. Glue several copies of a vector to create a vector-of-vectors.
const std::vector<std::vector<int> > h = vec(nuwen::glu<std::vector<int> >(a)(a)(a));
// h.size() is 3, h[0] == a, h[1] == a, h[2] == a.
huff.hh provides functions for Huffman compression and decompression.
namespace nuwen {
inline vuc_t huff(const vuc_t& v);
inline vuc_t puff(const vuc_t& v);
}
They are both implemented with deterministic finite automata, making them very fast by avoiding bitwise operations. They use canonical codes, which incur a fixed overhead of 256 bytes.
jpeg.hh provides a wrapper around libjpeg.
namespace nuwen {
inline boost::tuple<ul_t, ul_t, vuc_t> decompress_jpeg_insecurely(const vuc_t& input, ul_t max_width = 2048, ul_t max_height = 2048);
}
Note that "insecurely" refers to libjpeg's error handling policy of exiting upon fatal errors. Therefore, this function should not be used with untrusted data. This function can be used by any program which controls its JPEG resources, such as a game.
The returned tuple contains the image's width, height, and RGB data.
memory.hh provides code for determining the current memory usage of the current process.
namespace nuwen {
inline ull_t vm_bytes();
}
vm_bytes() returns how many bytes of the current process's virtual address space are currently being used.
mtf.hh provides functions for the Move To Front-2 transformation and its inverse.
namespace nuwen {
inline void mtf2(vuc_t& v);
inline void unmtf2(vuc_t& v);
}
MTF-2 is the second stage of Burrows-Wheeler compression, after the BWT itself.
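As background, here is a sketch of the plain Move To Front transformation and its inverse, operating in place on a byte vector in the same style as the signatures above. This is the textbook algorithm, not the MTF-2 variant that the library implements (which uses a different rule for when a symbol is moved all the way to the front); the names mtf_sketch, unmtf_sketch, and bytes_t are hypothetical. A C++11 compiler is assumed for std::iota.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

typedef std::vector<unsigned char> bytes_t;

// Plain MTF: replace each byte with its current rank in a recency list,
// then move that byte to the front of the list.
inline void mtf_sketch(bytes_t& v) {
    bytes_t table(256);
    std::iota(table.begin(), table.end(), 0); // initial list: 0, 1, ..., 255
    for (std::size_t i = 0; i < v.size(); ++i) {
        const bytes_t::iterator pos = std::find(table.begin(), table.end(), v[i]);
        v[i] = static_cast<unsigned char>(pos - table.begin());
        std::rotate(table.begin(), pos, pos + 1); // move the symbol to the front
    }
}

// Inverse: each input value is a rank; look up the byte and update the list
// exactly as the forward transformation did.
inline void unmtf_sketch(bytes_t& v) {
    bytes_t table(256);
    std::iota(table.begin(), table.end(), 0);
    for (std::size_t i = 0; i < v.size(); ++i) {
        const bytes_t::iterator pos = table.begin() + v[i];
        v[i] = *pos;
        std::rotate(table.begin(), pos, pos + 1);
    }
}
```

After the BWT has clustered identical bytes together, this step turns those runs into long stretches of small numbers, mostly zeros, which the later stages can encode compactly.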
priority.hh provides code to set the priority of the current process to "idle".
namespace nuwen {
inline void set_priority_idle();
}
random.hh provides a wrapper around Boost's Mersenne Twister implementation.
namespace nuwen {
namespace random {
class twister : public boost::noncopyable {
public:
inline twister();
inline uc_t random_uc();
inline us_t random_us();
inline ul_t random_ul();
inline ull_t random_ull();
inline float random_float_0_1();
inline double random_double_0_1();
};
}
}
nuwen::random::twister seeds itself upon construction.
serial.hh provides code for lightweight serialization.
namespace nuwen {
namespace ser {
class serial : public boost::noncopyable {
public:
    inline serial();
    template <typename T> explicit serial(const T& t);
    inline vuc_t vuc() const;
    template <typename T> serial& operator<<(const T& t);
    inline serial& operator<<(bool b);
    inline serial& operator<<(uc_t n);
    inline serial& operator<<(sc_t n);
    inline serial& operator<<(us_t n);
    inline serial& operator<<(ss_t n);
    inline serial& operator<<(ul_t n);
    inline serial& operator<<(sl_t n);
    inline serial& operator<<(ull_t n);
    inline serial& operator<<(sll_t n);
    inline serial& operator<<(const std::string& s);
    template <typename T> serial& operator<<(const std::vector<T>& v);
    template <typename T> serial& operator<<(const std::deque<T>& d);
    template <typename T> serial& operator<<(const std::list<T>& l);
    template <typename A, typename B> serial& operator<<(const std::pair<A, B>& p);
    template <typename T, typename L> serial& operator<<(const std::set<T, L>& s);
    template <typename T, typename L> serial& operator<<(const std::multiset<T, L>& ms);
    template <typename K, typename V, typename L> serial& operator<<(const std::map<K, V, L>& m);
    template <typename K, typename V, typename L> serial& operator<<(const std::multimap<K, V, L>& mm);
};
}
namespace des {
class deserial : public boost::noncopyable {
public:
    inline explicit deserial(const vuc_t& v);
    template <typename T> T get();
    inline vuc_ci_t curr() const;
    inline vuc_ci_t end() const;
    inline ull_t size() const;
    inline void consume(ull_t n);
};
}
}
Serialization is the process by which arbitrarily complex objects are represented as a sequence of bytes, which can then be stored in a file or transmitted over the network, and later reconstructed into living objects again.
A nuwen::ser::serial can be default-constructed to be empty, or constructed from an object to serialize that object. To serialize additional objects, use its operator<<(), which can be chained. Use vuc() to obtain a vuc_t representation of all of the objects that you have serialized.
A nuwen::des::deserial is constructed from a vuc_t containing serialized data. The nuwen::des::deserial object does not make a copy of this data; it simply maintains iterators into the vuc_t, which must remain valid while you deserialize objects.
Given a nuwen::des::deserial named d, to deserialize an object of type T, use d.get<T>().
By default, bool, the fixed-length integer types, and std::string are serializable. Pairs, sequences, and associative containers of serializable types are also serializable. To make a user-defined type serializable, provide a serialize() method that takes and returns nuwen::ser::serial&. To make a user-defined type deserializable, provide an explicit constructor from nuwen::des::deserial&. Although you can use the curr(), end(), size(), and consume() methods of nuwen::des::deserial, usually you just need to deserialize each member of your object in order.
Two warnings: First, only fixed-length integers should be serialized. There's no way to prevent variable-length integers from being serialized, so it's up to you to avoid doing this. The serialized representation that libnuwen uses is completely portable, but there's no helping you if you serialize 4 bytes on one machine and expect to read out 8 bytes on another machine.
Second, it's up to you to ensure that deserialization happens in the correct order with the correct types. The libnuwen deserializers throw exceptions when insufficient bytes are available, or when the serialized data is clearly inconsistent with the type that you're trying to extract (e.g. trying to deserialize a bool from a byte other than 0 or 1), but otherwise perform no validation.
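The two warnings can be made concrete with a small sketch. This is not libnuwen's implementation or its wire format: the classes mini_serial and mini_deserial are hypothetical, the big-endian byte order is an assumption, and only bool and a 32-bit unsigned integer are handled. It shows a fixed-width, machine-independent layout on the writing side, and the two checks described above (insufficient bytes, and a bool byte other than 0 or 1) on the reading side.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

typedef std::vector<std::uint8_t> bytes_t;

// Hypothetical writer: appends values in a fixed layout that does not depend
// on the machine's native integer sizes or byte order.
class mini_serial {
public:
    mini_serial& operator<<(bool b) {
        m_bytes.push_back(b ? 1 : 0);
        return *this;
    }
    mini_serial& operator<<(std::uint32_t n) {
        // Always exactly 4 bytes, most significant first (an assumed layout).
        for (int shift = 24; shift >= 0; shift -= 8) {
            m_bytes.push_back(static_cast<std::uint8_t>(n >> shift));
        }
        return *this;
    }
    const bytes_t& bytes() const { return m_bytes; }
private:
    bytes_t m_bytes;
};

// Hypothetical reader: keeps iterators into the caller's buffer (no copy),
// so the buffer must outlive the reader.
class mini_deserial {
public:
    explicit mini_deserial(const bytes_t& v) : m_curr(v.begin()), m_end(v.end()) {}
    bool get_bool() {
        require(1);
        const std::uint8_t b = *m_curr++;
        if (b > 1) {
            throw std::runtime_error("byte is not a valid bool");
        }
        return b == 1;
    }
    std::uint32_t get_u32() {
        require(4);
        std::uint32_t n = 0;
        for (int i = 0; i < 4; ++i) {
            n = (n << 8) | *m_curr++;
        }
        return n;
    }
private:
    void require(std::size_t n) const {
        if (static_cast<std::size_t>(m_end - m_curr) < n) {
            throw std::runtime_error("insufficient bytes");
        }
    }
    bytes_t::const_iterator m_curr;
    bytes_t::const_iterator m_end;
};
```

Nothing in the byte stream records which type was written, so reading with get_u32() where a bool was written would succeed silently whenever enough bytes are present. That is why the ordering is the caller's responsibility.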
This header provides an implementation of SHA-256.
namespace nuwen {
inline vuc_t sha256(const vuc_t& v);
}
This implementation has been tested with the FIPS 180-2 examples as well as the 129 NIST test vectors and 100 Monte Carlo samples.
This header provides wrappers around Berkeley Sockets/Winsock 2.
namespace nuwen {
inline void socket_startup();
namespace sock {
const ull_t DEFAULT_TIMEOUT_MS = 30000;
const ul_t DEFAULT_LIMIT = 10485760;
class client_socket {
public:
    inline client_socket(const std::string& server, us_t port);
    inline vuc_t read(ul_t limit = DEFAULT_LIMIT);
    inline void write(const vuc_t& v);
};
class client_id {
public:
    client_id();
    bool operator==(const client_id& other) const;
    bool operator!=(const client_id& other) const;
    bool operator< (const client_id& other) const;
};
class server_socket : public boost::noncopyable {
public:
    inline explicit server_socket(us_t port,
                                  ull_t timeout_ms = DEFAULT_TIMEOUT_MS,
                                  ul_t receive_limit = DEFAULT_LIMIT,
                                  ul_t send_limit = DEFAULT_LIMIT);
    inline std::pair<client_id, vuc_t> next_request();
    inline void flush();
    inline void write_continue(client_id id, const vuc_t& v);
    inline void write_finish (client_id id, const vuc_t& v);
    inline void finish(client_id id);
};
}
}
nuwen::sock::client_socket and nuwen::sock::server_socket implement a semi-synchronous multiple-client/single-server model.
socket_startup() must be called a single time before constructing any sockets. It also registers shutdown code to be called when the program exits.
A nuwen::sock::client_socket, which is a copyable RAII resource manager with shared semantics, is constructed with the server's hostname (or IP) and the port that the server listens on. Once connected, it can synchronously send a vuc_t to the server or receive a vuc_t from the server. The content of such requests and responses is up to you, as is the pattern in which the client and server talk to each other. (You clearly don't want them to get confused and deadlock while waiting to read data from each other.)
nuwen::sock::client_socket::read() takes a limit; if the server attempts to send more bytes than this in a single response, an exception is thrown.
A nuwen::sock::server_socket is a noncopyable RAII resource manager. It is an abstraction of ::select(), capable of maintaining multiple connections simultaneously. It is constructed with a port to listen on, as well as three limits. timeout_ms specifies how many milliseconds a connection may remain open without generating a request. receive_limit specifies the maximum number of bytes the server will attempt to receive from a connection in a single request. send_limit specifies the maximum number of bytes that may pile up waiting to be sent over a single connection. Offending connections are summarily killed.
When next_request() is called, a lot of work happens. It blocks until a request from a client is received, at which point the request and a client ID are returned. Each client is given a unique client ID, which is used to label any requests from that client. Client IDs are also used to direct responses to clients. (Client IDs are never reused; if a client closes its connection to the server and opens another one, the new connection gets a new client ID. Also, don't play with fire; client IDs are not unique between multiple servers.)
While next_request() blocks, new connections are being opened, requests are materializing, responses are dematerializing, and dead connections are being closed. The server therefore processes each request synchronously, but sends and receives data asynchronously.
flush() blocks until all waiting data has been sent to clients. It does a subset of the work that next_request() does: it sends data and closes dead connections, but does not open new connections or receive data.
write_continue() queues up data to be sent to a given client. The data isn't actually sent until either next_request() or flush() is called.
finish() marks a given connection to be closed after all data has been sent to it.
write_finish() performs write_continue() followed by finish().
This header provides functions for string manipulation.
namespace nuwen {
inline std::string strip_outside_whitespace(const std::string& s);
inline std::string& find_and_replace_mutate(std::string& s, const std::string& old, const std::string& nu);
inline std::string find_and_replace_copy(const std::string& s, const std::string& old, const std::string& nu);
inline std::string upper(const std::string& s);
inline std::string lower(const std::string& s);
inline std::string comma_from_ull(ull_t n);
inline std::string comma_from_sll(sll_t n);
}
comma_from_ull() converts an unsigned integer into a comma-separated string like "12,345", and comma_from_sll() does the same for signed integers. They can also be used with types smaller than 64 bits.
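The grouping itself is simple enough to sketch. The following is an illustration of the technique rather than libnuwen's code; the names comma_group_unsigned and comma_group_signed are hypothetical, and std::to_string requires C++11.

```cpp
#include <string>

// Hypothetical sketch: format an unsigned integer with a comma before every
// group of three digits, counting from the right.
inline std::string comma_group_unsigned(unsigned long long n) {
    std::string digits = std::to_string(n);
    for (std::string::size_type pos = digits.size(); pos > 3; ) {
        pos -= 3;
        digits.insert(pos, 1, ',');
    }
    return digits;
}

// Hypothetical sketch: the signed version formats the magnitude and prepends
// a minus sign. The negation is done in unsigned arithmetic so that the most
// negative value does not overflow.
inline std::string comma_group_signed(long long n) {
    if (n >= 0) {
        return comma_group_unsigned(static_cast<unsigned long long>(n));
    }
    const unsigned long long magnitude = 0ULL - static_cast<unsigned long long>(n);
    return "-" + comma_group_unsigned(magnitude);
}
```

Because smaller integer types convert implicitly to the 64-bit parameter types, the same two functions cover 8-bit, 16-bit, and 32-bit values as well.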
This header provides the macro NUWEN_TEST, which is used by the libnuwen test cases. The macro takes two arguments: an ID and an expression.
The ID is a string literal or std::string which uniquely identifies the test being performed. (The test will fail if it uses an ID that has already been used.)
The expression is the thing being tested, which should evaluate to true. If it evaluates to false, throws an exception, or exits unexpectedly, the test will fail.
If all tests succeed, a message is printed in green. If a test fails, a message is printed in red with the test's ID and the failure mode. Then std::exit(EXIT_FAILURE) is immediately called; subsequent tests are not carried out.
This is a macro instead of a function because it has to catch any exceptions being emitted by the test. It is best to perform all work inside the NUWEN_TEST macro (using a helper function if necessary, as many libnuwen test cases do), instead of performing work outside and testing only the final result.
Invocations of NUWEN_TEST should not be nested.
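To see why a macro is needed, consider this minimal sketch. It is not the NUWEN_TEST implementation: SKETCH_TEST, sketch_failures(), and sketch_always_throws() are hypothetical names, and unlike NUWEN_TEST the sketch records failures in a list instead of calling std::exit(), does not check for duplicate IDs, and cannot detect an unexpected exit. The key point is that the expression is expanded inside the try block, so an exception thrown while evaluating it is caught. A function taking a bool would have its argument evaluated before the function body ever ran.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical failure log used by the sketch.
inline std::vector<std::string>& sketch_failures() {
    static std::vector<std::string> failures;
    return failures;
}

// Hypothetical helper for demonstration: evaluating a call to it throws.
inline bool sketch_always_throws() {
    throw std::runtime_error("boom");
}

// The expression is pasted inside the try block, so exceptions emitted while
// evaluating it are caught and reported as a failure of this test.
#define SKETCH_TEST(id, expr)                                                      \
    do {                                                                           \
        try {                                                                      \
            if (!(expr)) {                                                         \
                sketch_failures().push_back(std::string(id) + ": returned false"); \
            }                                                                      \
        } catch (...) {                                                            \
            sketch_failures().push_back(std::string(id) + ": threw an exception"); \
        }                                                                          \
    } while (false)
```

This also shows why it is best to do all of the work inside the macro: anything computed before the invocation runs outside the try block, where an exception would escape the test entirely.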
This header provides functions for date/time formatting.
namespace nuwen {
// libnuwen uses Qeng Ho (QH) time, which is defined as the number of
// seconds since Neil Armstrong first stepped onto the lunar surface.
// Why use yet another epoch? Its main competitors are UTC and TAI. The
// wretched monstrosity that is UTC isn't even continuous, being
// repeatedly polluted with leap seconds. Worse, UTC between 1961 and
// 1972 didn't even run at one UTC second per SI second. Worst of all,
// UTC wasn't defined before 1961. TAI is somewhat better, being
// continuous. However, it is connected to the vile Common Era, and we
// can't have that.
// Unix time is beyond evil.
// QH time has the following advantages:
// 1. It is continuous, making arithmetic easy.
// 2. It is defined arbitrarily far into the past and into the future.
// 3. It is always stored in a sll_t, avoiding vile wrapping problems.
// 4. It is readily explainable to anyone anywhere, as it is not
// connected to the vile Common Era.
// Of course, no one else speaks QH time yet, so libnuwen must go to
// some length in order to obtain the current QH time, convert QH to
// UTC and back, and format QH time into something locally
// comprehensible.
inline sl_t tai_minus_utc_at_utc(const boost::posix_time::ptime& utc);
inline sll_t utc_minus(const boost::posix_time::ptime& utc_l, const boost::posix_time::ptime& utc_r);
inline std::pair<boost::posix_time::ptime, bool> utc_plus(boost::posix_time::ptime utc, sll_t n);
inline sll_t qh_from_utc(const boost::posix_time::ptime& utc);
inline std::pair<boost::posix_time::ptime, bool> utc_from_qh(sll_t qh);
inline sll_t current_qh_time();
// Qeng Ho Time Format
// YY  / YYYY   (Y)ear          (evil)   / (good)     06         / 2006
// tt  / TT     DS(T)           (lower)  / (upper)    st/dt      / ST/DT
// M   / MM     (M)onth         (bare)   / (padded)   1-12       / 01-12
// MMM / MMMM   (M)onth         (abbrev) / (full)     Jan        / January
// D   / DD     (D)ay           (bare)   / (padded)   1-31       / 01-31
// o   / O      Day (O)rdinal   (lower)  / (upper)    th         / TH
// DDD / DDDD   (D)ay           (abbrev) / (full)     Mon        / Monday
// a   / A      (A)M/PM         (lower)  / (upper)    a/p        / A/P
// aa  / AA     (A)M/PM         (lower)  / (upper)    am/pm      / AM/PM
// H   / HH     (H)our-24       (bare)   / (padded)   0-23       / 00-23
// h   / hh     (H)our-12       (bare)   / (padded)   1-12       / 01-12
// m   / mm     (M)inute        (bare)   / (padded)   0-59       / 00-59
// s   / ss     (S)econd        (bare)   / (padded)   0-60       / 00-60
// q   / qq     (Q)eng Ho       (solid)  / (comma)    1035424059 / 1,035,424,059
// r   / R      (R)oman Year    (lower)  / (upper)    mmvi       / MMVI
// All alphabetic characters are reserved, while all other characters are literal.
// Surround a sequence of characters with stars *like this* to make it literal.
// An empty pair of stars will be replaced with a single star.
inline std::string format_qh(sll_t qh, int hours_offset, bool use_dst, const std::string& fmt);
}
These functions properly recognize both leap seconds and DST, including the rule change from 2006 to 2007, although no attempt is made to handle other rules.
tai_minus_utc_at_utc() returns TAI - UTC at a given UTC time. For example, during all of 2006, TAI - UTC was 33. (That is, TAI was 33 seconds ahead of UTC.)
utc_minus() subtracts two UTC times, taking into account leap seconds.
utc_plus() adds a given number of seconds to a given UTC time, returning the result as a UTC time and a bool indicating whether the result falls on a leap second (which cannot be represented by boost::posix_time::ptime).
qh_from_utc() converts a given UTC time to QH time.
utc_from_qh() converts a given QH time to UTC time (and a leap second bool).
current_qh_time() returns the current QH time.
format_qh() formats a given QH time into a string. hours_offset is the time offset that you want to use in the absence of DST. use_dst indicates whether you want to apply the DST rules. (Whether a DST correction actually occurs is determined by the time being formatted.) For example, regardless of the time being formatted, you would use -8 and true for Pacific Time, while you would use 0 and false for UTC.
My favorite format is "[M/D/YYYY DDD h:mm.ss AA]", which generates strings like "[2/3/2006 Fri 4:05.06 AM]".
(By the way, Qeng Ho is pronounced "Cheng Ho".)
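The relationship between tai_minus_utc_at_utc() and utc_minus() described above can be sketched as follows. This is an illustration under stated assumptions, not libnuwen's code: all names are hypothetical, UTC instants are represented as a plain count of non-leap seconds since 1970-01-01 purely to keep the example short (libnuwen uses boost::posix_time::ptime), and the offset table is a three-entry excerpt, so the sketch is only meaningful for instants from 1999 up to the first leap second after 2009.

```cpp
#include <cstddef>

// Hypothetical table entry: the value of TAI - UTC from `since` onward.
struct sketch_offset_entry {
    long long since;    // first instant at which the offset applies
    int tai_minus_utc;  // TAI - UTC in seconds from that instant onward
};

// Truncated excerpt of the real leap second history.
const sketch_offset_entry SKETCH_OFFSETS[] = {
    {  915148800LL, 32 },  // 1999-01-01 00:00:00 UTC
    { 1136073600LL, 33 },  // 2006-01-01 00:00:00 UTC
    { 1230768000LL, 34 }   // 2009-01-01 00:00:00 UTC
};

// Look up TAI - UTC at a given instant (valid only within the excerpt).
inline int sketch_tai_minus_utc(long long utc) {
    int offset = SKETCH_OFFSETS[0].tai_minus_utc;
    const std::size_t count = sizeof(SKETCH_OFFSETS) / sizeof(SKETCH_OFFSETS[0]);
    for (std::size_t i = 0; i < count; ++i) {
        if (utc >= SKETCH_OFFSETS[i].since) {
            offset = SKETCH_OFFSETS[i].tai_minus_utc;
        }
    }
    return offset;
}

// Elapsed SI seconds from utc_r to utc_l: the naive difference, corrected by
// the leap seconds inserted between the two instants.
inline long long sketch_utc_minus(long long utc_l, long long utc_r) {
    return (utc_l - utc_r) + (sketch_tai_minus_utc(utc_l) - sketch_tai_minus_utc(utc_r));
}
```

For example, 2005-12-31 23:59:59 UTC and 2006-01-01 00:00:00 UTC look one second apart, but a leap second (23:59:60) sat between them, so the elapsed time is two seconds.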
This header provides type traits which are used by other libnuwen headers.
namespace nuwen {
template <typename X> struct is_sequence { static const bool value = false; };
template <typename T> struct is_sequence<std::vector<T> > { static const bool value = true; };
template <typename T> struct is_sequence<std::deque<T> > { static const bool value = true; };
template <typename T> struct is_sequence<std::list<T> > { static const bool value = true; };
template <typename X, typename T> struct is_sequence_of { static const bool value = false; };
template <typename T> struct is_sequence_of<std::vector<T>, T> { static const bool value = true; };
template <typename T> struct is_sequence_of<std::deque<T>, T> { static const bool value = true; };
template <typename T> struct is_sequence_of<std::list<T>, T> { static const bool value = true; };
template <typename X> struct is_string {
    static const bool value = boost::is_same<X, std::string>::value;
};
template <typename X> struct is_stringlike_sequence {
    static const bool value =
           is_sequence_of<X, char>::value
        || is_sequence_of<X, uc_t>::value
        || is_sequence_of<X, sc_t>::value;
};
template <typename X, typename T> struct is_bounded_array_of { static const bool value = false; };
template <typename T, std::size_t N> struct is_bounded_array_of< T[N], T> { static const bool value = true; };
template <typename T, std::size_t N> struct is_bounded_array_of<const T[N], T> { static const bool value = true; };
}
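As an illustration of how a trait like is_sequence can be consumed, the sketch below uses it to select between two overloads at compile time. It is a guess at a typical usage pattern, not a description of how other libnuwen headers actually use these traits. The names sketch_is_sequence and sketch_count are hypothetical, and std::enable_if (C++11) stands in for the Boost facilities.

```cpp
#include <cstddef>
#include <deque>
#include <list>
#include <type_traits>
#include <vector>

// Hypothetical copy of the trait, written in the same style as above.
template <typename X> struct sketch_is_sequence { static const bool value = false; };
template <typename T> struct sketch_is_sequence<std::vector<T> > { static const bool value = true; };
template <typename T> struct sketch_is_sequence<std::deque<T> > { static const bool value = true; };
template <typename T> struct sketch_is_sequence<std::list<T> > { static const bool value = true; };

// Selected only when Seq is a recognized sequence: counts its elements.
template <typename Seq>
typename std::enable_if<sketch_is_sequence<Seq>::value, std::size_t>::type
sketch_count(const Seq& s) {
    return s.size();
}

// Selected for everything else: a lone object counts as one element.
template <typename X>
typename std::enable_if<!sketch_is_sequence<X>::value, std::size_t>::type
sketch_count(const X&) {
    return 1;
}
```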
This header provides code for vector and byte stuff, vaguely speaking.
namespace nuwen {
namespace pack {
class packed_bits {
public:
    inline packed_bits();
    inline void push_back(bool bit);
    inline vuc_t vuc() const;
};
}
inline ull_t bytes_from_bits(ull_t n);
inline bool bit_from_vuc(const vuc_t& v, ull_t n);
template <typename DstCont, typename SrcCont> DstCont sequence_cast(const SrcCont& s);
template <typename DstCont, typename SrcCont>
    typename boost::disable_if<boost::mpl::or_<boost::is_array<SrcCont>, boost::is_pointer<SrcCont> >, DstCont>::type
    string_cast(const SrcCont& s);
template <typename DstCont> DstCont string_cast(const char * c);
inline uc_t uc_from_sc ( sc_t x);
inline us_t us_from_ss ( ss_t x);
inline ul_t ul_from_sl ( sl_t x);
inline ull_t ull_from_sll(sll_t x);
inline sc_t sc_from_uc ( uc_t x);
inline ss_t ss_from_us ( us_t x);
inline sl_t sl_from_ul ( ul_t x);
inline sll_t sll_from_ull(ull_t x);
inline us_t us_from_vuc(const vuc_t& v, vuc_s_t n = 0);
inline ul_t ul_from_vuc(const vuc_t& v, vuc_s_t n = 0);
inline ull_t ull_from_vuc(const vuc_t& v, vuc_s_t n = 0);
inline us_t us_from_vuc(vuc_ci_t i, vuc_ci_t end);
inline ul_t ul_from_vuc(vuc_ci_t i, vuc_ci_t end);
inline ull_t ull_from_vuc(vuc_ci_t i, vuc_ci_t end);
inline vuc_t vuc_from_us ( us_t x);
inline vuc_t vuc_from_ul ( ul_t x);
inline vuc_t vuc_from_ull(ull_t x);
inline std::string hex_from_uc(uc_t c);
inline std::string hex_from_vuc(const vuc_t& v);
inline uc_t uc_from_hex(const std::string& s, std::string::size_type pos = 0);
inline vuc_t vuc_from_hex(const std::string& s);
}
nuwen::pack::packed_bits allows you to insert bits one at a time and then obtain a vuc_t (which is zero-padded if the bits don't end on a byte boundary).
bytes_from_bits() converts a number of bits into a number of bytes, counting partial bytes. For example, 16 bits is 2 bytes, while 17 bits is 3 bytes.
bit_from_vuc() reads the Nth bit of a vuc_t.
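The three bit-level facilities described so far fit together as in the following sketch. It is not libnuwen's code: the names are hypothetical, and the bit order within each byte (most significant bit first) is an assumption, since the text above does not specify it.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical bit packer: bits are appended one at a time, filling each byte
// from its most significant bit downward (an assumed order). Unused trailing
// bits of the last byte stay zero.
class sketch_packed_bits {
public:
    sketch_packed_bits() : m_bits(0) {}
    void push_back(bool bit) {
        if (m_bits % 8 == 0) {
            m_bytes.push_back(0);  // start a new, zero-padded byte
        }
        if (bit) {
            m_bytes.back() = static_cast<std::uint8_t>(m_bytes.back() | (0x80u >> (m_bits % 8)));
        }
        ++m_bits;
    }
    const std::vector<std::uint8_t>& bytes() const { return m_bytes; }
private:
    std::vector<std::uint8_t> m_bytes;
    unsigned long long m_bits;
};

// Number of bytes needed to hold n bits, counting a partial byte as a whole one.
// Written without (n + 7) so that very large n cannot overflow.
inline unsigned long long sketch_bytes_from_bits(unsigned long long n) {
    return n / 8 + (n % 8 != 0 ? 1 : 0);
}

// Read bit n, using the same most-significant-first order as the packer.
// The caller must ensure that n is within range.
inline bool sketch_bit_from_bytes(const std::vector<std::uint8_t>& v, unsigned long long n) {
    return ((v[static_cast<std::size_t>(n / 8)] >> (7 - n % 8)) & 1u) != 0;
}
```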
sequence_cast<DstCont>() converts one sequence into another sequence with the same value_type.
string_cast<DstCont>() converts a C string or std::string into a stringlike sequence, or a stringlike sequence into a std::string. (A stringlike sequence is a sequence of char, uc_t, or sc_t.)
The family of uc_from_sc() functions converts signed integers into unsigned integers losslessly, while the family of sc_from_uc() functions converts them back.
The family of us_from_vuc() functions reads unsigned 16-bit, 32-bit, and 64-bit integers from a vuc_t.
The family of vuc_from_us() functions converts unsigned 16-bit, 32-bit, and 64-bit integers into a vuc_t.
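A sketch of what such conversions involve, written once as templates instead of once per width. None of this is libnuwen's code: the names are hypothetical, the big-endian byte order is an assumption, and throwing std::out_of_range when too few bytes remain is this sketch's choice rather than documented libnuwen behavior.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical: split an unsigned integer into sizeof(U) bytes, most
// significant byte first (assumed order).
template <typename U>
std::vector<std::uint8_t> sketch_bytes_from_unsigned(U x) {
    std::vector<std::uint8_t> out(sizeof(U));
    for (std::size_t i = sizeof(U); i-- > 0; ) {
        out[i] = static_cast<std::uint8_t>(x & 0xFF);
        x = static_cast<U>(x >> 8);
    }
    return out;
}

// Hypothetical: read an unsigned integer of sizeof(U) bytes starting at
// offset n, most significant byte first.
template <typename U>
U sketch_unsigned_from_bytes(const std::vector<std::uint8_t>& v, std::size_t n = 0) {
    if (n > v.size() || v.size() - n < sizeof(U)) {
        throw std::out_of_range("not enough bytes");
    }
    U x = 0;
    for (std::size_t i = 0; i < sizeof(U); ++i) {
        x = static_cast<U>((x << 8) | v[n + i]);
    }
    return x;
}
```

Working with shifts rather than copying memory is what makes the result independent of the host machine's byte order.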
hex_from_uc() converts a single byte into a hexadecimal string, while hex_from_vuc() converts an entire vuc_t into a hexadecimal string. Both emit uppercase.
uc_from_hex() reads a single byte out of a hexadecimal string, while vuc_from_hex() converts an entire hexadecimal string into a vuc_t. Both handle lowercase and uppercase.
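The asymmetry described above (uppercase on output, either case on input) can be sketched like this. The names are hypothetical and this is not libnuwen's code. In particular, throwing std::invalid_argument on malformed input is this sketch's choice, as the text does not say how libnuwen reports such errors.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical: encode bytes as uppercase hexadecimal, two digits per byte.
inline std::string sketch_hex_from_bytes(const std::vector<std::uint8_t>& v) {
    static const char DIGITS[] = "0123456789ABCDEF";
    std::string out;
    out.reserve(v.size() * 2);
    for (std::size_t i = 0; i < v.size(); ++i) {
        out += DIGITS[v[i] >> 4];
        out += DIGITS[v[i] & 0x0F];
    }
    return out;
}

// Hypothetical: value of one hexadecimal digit, accepting both cases.
inline int sketch_nibble(char c) {
    if (c >= '0' && c <= '9') { return c - '0'; }
    if (c >= 'a' && c <= 'f') { return c - 'a' + 10; }
    if (c >= 'A' && c <= 'F') { return c - 'A' + 10; }
    throw std::invalid_argument("not a hexadecimal digit");
}

// Hypothetical: decode a hexadecimal string into bytes.
inline std::vector<std::uint8_t> sketch_bytes_from_hex(const std::string& s) {
    if (s.size() % 2 != 0) {
        throw std::invalid_argument("odd number of hexadecimal digits");
    }
    std::vector<std::uint8_t> out;
    out.reserve(s.size() / 2);
    for (std::string::size_type i = 0; i < s.size(); i += 2) {
        out.push_back(static_cast<std::uint8_t>(sketch_nibble(s[i]) * 16 + sketch_nibble(s[i + 1])));
    }
    return out;
}
```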
This header provides functions for the Zero Length Encoding transformation and its inverse.
namespace nuwen {
inline vuc_t zle(const vuc_t& v);
inline vuc_t unzle(const vuc_t& v);
}
ZLE is the third stage of Burrows-Wheeler compression, after the BWT and MTF-2 transformations.
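The idea behind this stage is that MTF output is dominated by runs of zeros, which can be stored as a count instead of being written out. The sketch below shows a deliberately simple zero-run coding to make that idea concrete. It is not the encoding that zle() uses (which is not described here): the names and the format, a zero byte followed by a run length from 1 to 255, are invented for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

typedef std::vector<std::uint8_t> bytes_t;

// Hypothetical simplified coding: nonzero bytes are copied unchanged; each run
// of zeros is written as a 0 byte followed by the run length (1 to 255).
// Longer runs are split into several such pairs.
inline bytes_t sketch_zero_run_encode(const bytes_t& v) {
    bytes_t out;
    for (std::size_t i = 0; i < v.size(); ) {
        if (v[i] != 0) {
            out.push_back(v[i]);
            ++i;
            continue;
        }
        std::size_t run = 0;
        while (i < v.size() && v[i] == 0 && run < 255) {
            ++run;
            ++i;
        }
        out.push_back(0);
        out.push_back(static_cast<std::uint8_t>(run));
    }
    return out;
}

// Inverse of the simplified coding above. Throws if a 0 byte is not followed
// by a valid run length.
inline bytes_t sketch_zero_run_decode(const bytes_t& v) {
    bytes_t out;
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (v[i] != 0) {
            out.push_back(v[i]);
            continue;
        }
        if (i + 1 >= v.size() || v[i + 1] == 0) {
            throw std::runtime_error("malformed zero run");
        }
        out.insert(out.end(), static_cast<std::size_t>(v[i + 1]), static_cast<std::uint8_t>(0));
        ++i;  // skip the run length byte
    }
    return out;
}
```

Note the tradeoff in this simplified format: an isolated zero grows from one byte to two, so it only pays off when long runs are common.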
This header provides wrappers around zlib.
namespace nuwen {
inline vuc_t zlib(const vuc_t& v, int level = Z_BEST_COMPRESSION, int strategy = Z_DEFAULT_STRATEGY);
inline vuc_t unzlib(const vuc_t& v);
}
The zlib library exposes several tunable parameters. The zlib() wrapper allows you to tune both the compression level and strategy. It sets two other parameters itself to increase compression and speed at the expense of memory usage.
libnuwen-2.0.0.0.tar.bz2 (59 KB)
libnuwen-1.0.28.0.tar.bz2 (43 KB)
libnuwen-1.0.27.2.tar.bz2 (42 KB)
libnuwen-1.0.27.1.tar.bz2 (42 KB)
libnuwen-1.0.27.0.tar.bz2 (42 KB)
libnuwen-1.0.26.0.tar.bz2 (42 KB)
http://nuwen.net/libnuwen.html

stl@nuwen.net
Updated 7/1/2007.