binary(3erl)               Erlang Module Definition               binary(3erl)

NAME
       binary - Library for handling binary data.

DESCRIPTION
       This module contains functions for manipulating byte-oriented binaries.
       Although the majority of functions could be provided using  bit-syntax,
       the  functions in this library are highly optimized and are expected to
       either execute faster or consume less memory, or both, than a  counter-
       part written in pure Erlang.

       The  module  is provided according to Erlang Enhancement Proposal (EEP)
       31.

   Note:
       The library handles byte-oriented data. For bitstrings that are not
       binaries (that is, that do not contain a whole number of octets), a
       badarg exception is raised by any of the functions in this module.
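
       As a minimal illustration of the note above (the 3-bit value is an
       arbitrary choice for this sketch), a bitstring that is not a whole
       number of bytes is rejected:

       1> try binary:first(<<1:3>>) catch error:badarg -> badarg end.
       badarg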

DATA TYPES
       cp()

              Opaque data type representing a compiled search pattern. Guaran-
              teed  to  be  a tuple() to allow programs to distinguish it from
              non-precompiled search patterns.

       part() = {Start :: integer() >= 0, Length :: integer()}

              A representation of a part (or range) in a binary.  Start  is  a
              zero-based  offset  into  a binary() and Length is the length of
              that part. As input to functions in this module, a reverse  part
              specification is allowed, constructed with a negative Length, so
              that the part of the binary begins at  Start  +  Length  and  is
              -Length long. This is useful for referencing the last N bytes of
              a binary as {size(Binary), -N}. The functions in this module al-
              ways return part()s with positive Length.
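
              As a small sketch of a reverse part specification (the binary
              <<"erlang">> is only sample data), the last three bytes can be
              addressed as follows:

              1> Bin = <<"erlang">>.
              <<"erlang">>
              2> binary:bin_to_list(Bin, {byte_size(Bin), -3}).
              "ang"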

EXPORTS
       at(Subject, Pos) -> byte()

              Types:

                 Subject = binary()
                 Pos = integer() >= 0

              Returns  the byte at position Pos (zero-based) in binary Subject
              as an integer. If Pos >= byte_size(Subject), a badarg  exception
              is raised.
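
              For illustration, with sample data chosen for this sketch:

              1> binary:at(<<"erlang">>, 1).
              114
              %% 114 is the character code of $r.
              2> try binary:at(<<"erlang">>, 6) catch error:badarg -> badarg end.
              badarg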

       bin_to_list(Subject) -> [byte()]

              Types:

                 Subject = binary()

              Same as bin_to_list(Subject, {0,byte_size(Subject)}).

       bin_to_list(Subject, PosLen) -> [byte()]

              Types:

                 Subject = binary()
                 PosLen = part()

              Converts  Subject  to  a  list of byte()s, each representing the
              value of one byte. part() denotes which part of the binary()  to
              convert.

              Example:

              1> binary:bin_to_list(<<"erlang">>, {1,3}).
              "rla"
              %% or [114,108,97] in list notation.

              If PosLen in any way references outside the binary, a badarg ex-
              ception is raised.

       bin_to_list(Subject, Pos, Len) -> [byte()]

              Types:

                 Subject = binary()
                 Pos = integer() >= 0
                 Len = integer()

              Same as bin_to_list(Subject, {Pos, Len}).

       compile_pattern(Pattern) -> cp()

              Types:

                 Pattern = binary() | [binary()]

              Builds an internal structure representing  a  compilation  of  a
              search   pattern,   later  to  be  used  in  functions  match/3,
              matches/3, split/3, or replace/4. The cp() returned  is  guaran-
              teed  to  be  a tuple() to allow programs to distinguish it from
              non-precompiled search patterns.

              When a list of binaries is specified, it denotes a set of
              alternative binaries to search for. For example, if
              [<<"functional">>,<<"programming">>] is specified as Pattern,
              this means either <<"functional">> or <<"programming">>. The
              pattern is a set of alternatives; when only a single binary is
              specified, the set has only one element. The order of
              alternatives in a pattern is not significant.

              The list of binaries used for search alternatives must  be  flat
              and proper.

              If  Pattern  is  not  a binary or a flat proper list of binaries
              with length > 0, a badarg exception is raised.
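
              A sketch of compiling a pattern once and reusing it for several
              searches. The subject binaries are made up for this
              illustration, and the printed value of the compiled pattern is
              omitted because it is opaque:

              1> CP = binary:compile_pattern([<<"functional">>,
                                              <<"programming">>]).
              2> is_tuple(CP).
              true
              3> binary:match(<<"concurrent programming">>, CP).
              {11,11}
              4> binary:match(<<"imperative style">>, CP).
              nomatch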

       copy(Subject) -> binary()

              Types:

                 Subject = binary()

              Same as copy(Subject, 1).

       copy(Subject, N) -> binary()

              Types:

                 Subject = binary()
                 N = integer() >= 0

              Creates a binary with the content of Subject duplicated N times.

              This function always creates a new binary, even if N = 1. By us-
              ing copy/1 on a binary referencing a larger binary, one can free
              up the larger binary for garbage collection.

          Note:
              By deliberately copying a single binary to avoid  referencing  a
              larger  binary, one can, instead of freeing up the larger binary
              for later garbage collection, create much more binary data  than
              needed.  Sharing  binary  data  is usually good. Only in special
              cases, when small parts reference large binaries and  the  large
              binaries  are  no longer used in any process, deliberate copying
              can be a good idea.

              If N < 0, a badarg exception is raised.
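
              A short sketch of the duplication, using arbitrary sample
              data:

              1> binary:copy(<<"ab">>, 3).
              <<"ababab">>
              2> binary:copy(<<"ab">>, 0).
              <<>>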

       decode_unsigned(Subject) -> Unsigned

              Types:

                 Subject = binary()
                 Unsigned = integer() >= 0

              Same as decode_unsigned(Subject, big).

       decode_unsigned(Subject, Endianness) -> Unsigned

              Types:

                 Subject = binary()
                 Endianness = big | little
                 Unsigned = integer() >= 0

              Converts the binary digit representation, in big endian or  lit-
              tle  endian, of a positive integer in Subject to an Erlang inte-
              ger().

              Example:

              1> binary:decode_unsigned(<<169,138,199>>,big).
              11111111
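
              To make the effect of Endianness visible, the following sketch
              decodes the same two bytes (sample data) both ways:

              1> binary:decode_unsigned(<<1,0>>, big).
              256
              2> binary:decode_unsigned(<<1,0>>, little).
              1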

       encode_unsigned(Unsigned) -> binary()

              Types:

                 Unsigned = integer() >= 0

              Same as encode_unsigned(Unsigned, big).

       encode_unsigned(Unsigned, Endianness) -> binary()

              Types:

                 Unsigned = integer() >= 0
                 Endianness = big | little

              Converts a positive integer to the smallest possible representa-
              tion in a binary digit representation, either big endian or lit-
              tle endian.

              Example:

              1> binary:encode_unsigned(11111111, big).
              <<169,138,199>>
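
              As a sketch of the little-endian case and of a round trip
              through decode_unsigned/2 (the value 256 is an arbitrary
              choice):

              1> Enc = binary:encode_unsigned(256, little).
              <<0,1>>
              2> binary:decode_unsigned(Enc, little).
              256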

       first(Subject) -> byte()

              Types:

                 Subject = binary()

              Returns the first byte of binary Subject as an integer.  If  the
              size of Subject is zero, a badarg exception is raised.

       last(Subject) -> byte()

              Types:

                 Subject = binary()

              Returns  the  last  byte of binary Subject as an integer. If the
              size of Subject is zero, a badarg exception is raised.
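
              A brief sketch covering first/1 and last/1, with sample data
              chosen for illustration:

              1> binary:first(<<"erlang">>).
              101
              2> binary:last(<<"erlang">>).
              103
              3> try binary:last(<<>>) catch error:badarg -> badarg end.
              badarg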

       list_to_bin(ByteList) -> binary()

              Types:

                 ByteList = iolist()

              Works exactly as erlang:list_to_binary/1; added for
              completeness.
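
              Because ByteList is an iolist(), binaries, bytes, and nested
              lists can be mixed. A minimal sketch with sample data:

              1> binary:list_to_bin([<<"er">>, $l, "ang"]).
              <<"erlang">>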

       longest_common_prefix(Binaries) -> integer() >= 0

              Types:

                 Binaries = [binary()]

              Returns  the length of the longest common prefix of the binaries
              in list Binaries.

              Example:

              1> binary:longest_common_prefix([<<"erlang">>, <<"ergonomy">>]).
              2
              2> binary:longest_common_prefix([<<"erlang">>, <<"perl">>]).
              0

              If Binaries is not a flat list of binaries, a  badarg  exception
              is raised.
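
              The returned length can be combined with part/3 to obtain the
              prefix itself. A sketch, with the variable names Bins and N
              chosen for this illustration:

              1> Bins = [<<"erlang">>, <<"ergonomy">>].
              [<<"erlang">>,<<"ergonomy">>]
              2> N = binary:longest_common_prefix(Bins).
              2
              3> binary:part(hd(Bins), 0, N).
              <<"er">>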

       longest_common_suffix(Binaries) -> integer() >= 0

              Types:

                 Binaries = [binary()]

              Returns  the length of the longest common suffix of the binaries
              in list Binaries.

              Example:

              1> binary:longest_common_suffix([<<"erlang">>, <<"fang">>]).
              3
              2> binary:longest_common_suffix([<<"erlang">>, <<"perl">>]).
              0

              If Binaries is not a flat list of binaries, a  badarg  exception
              is raised.

       match(Subject, Pattern) -> Found | nomatch

              Types:

                 Subject = binary()
                 Pattern = binary() | [binary()] | cp()
                 Found = part()

              Same as match(Subject, Pattern, []).

       match(Subject, Pattern, Options) -> Found | nomatch

              Types:

                 Subject = binary()
                 Pattern = binary() | [binary()] | cp()
                 Found = part()
                 Options = [Option]
                 Option = {scope, part()}
                 part() = {Start :: integer() >= 0, Length :: integer()}

              Searches  for the first occurrence of Pattern in Subject and re-
              turns the position and length.

              The function returns {Pos, Length} for the  binary  in  Pattern,
              starting at the lowest position in Subject.

              Example:

              1> binary:match(<<"abcde">>, [<<"bcde">>, <<"cd">>],[]).
              {1,4}

              Even  though  <<"cd">> ends before <<"bcde">>, <<"bcde">> begins
              first and is therefore  the  first  match.  If  two  overlapping
              matches begin at the same position, the longest is returned.

              Summary of the options:

                {scope, {Start, Length}}:
                  Only  the  specified  part  is searched. Return values still
                  have offsets from  the  beginning  of  Subject.  A  negative
                  Length is allowed as described in section Data Types in this
                  manual.

              If none of the strings in Pattern is found, the atom nomatch  is
              returned.

              For a description of Pattern, see function compile_pattern/1.

              If {scope, {Start,Length}} is specified in the options such that
              Start > size of Subject, Start + Length < 0 or Start + Length  >
              size of Subject, a badarg exception is raised.
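
              A sketch of option scope, using sample data. Both scopes below
              cover bytes 2 through 5 of Subject, the second one written as
              a reverse part, and the returned offset is still relative to
              the beginning of Subject:

              1> binary:match(<<"abcabc">>, <<"bc">>).
              {1,2}
              2> binary:match(<<"abcabc">>, <<"bc">>, [{scope, {2,4}}]).
              {4,2}
              3> binary:match(<<"abcabc">>, <<"bc">>, [{scope, {6,-4}}]).
              {4,2}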

       matches(Subject, Pattern) -> Found

              Types:

                 Subject = binary()
                 Pattern = binary() | [binary()] | cp()
                 Found = [part()]

              Same as matches(Subject, Pattern, []).

       matches(Subject, Pattern, Options) -> Found

              Types:

                 Subject = binary()
                 Pattern = binary() | [binary()] | cp()
                 Found = [part()]
                 Options = [Option]
                 Option = {scope, part()}
                 part() = {Start :: integer() >= 0, Length :: integer()}

              As  match/2,  but Subject is searched until exhausted and a list
              of all non-overlapping parts matching Pattern  is  returned  (in
              order).

              The first and longest match is preferred to a shorter one, as
              illustrated by the following example:

              1> binary:matches(<<"abcde">>,
                                [<<"bcde">>,<<"bc">>,<<"de">>],[]).
              [{1,4}]

              The result shows that <<"bcde">> is selected instead of the
              shorter match <<"bc">> (which would have given rise to one more
              match, <<"de">>). This corresponds to the behavior of POSIX
              regular expressions (and programs like awk), but is not
              consistent with alternative matches in re (and Perl), where the
              lexical ordering in the search pattern instead selects which
              string matches.

              If  none  of the strings in a pattern is found, an empty list is
              returned.

              For a description of Pattern, see compile_pattern/1. For  a  de-
              scription of available options, see match/3.

              If {scope, {Start,Length}} is specified in the options such that
              Start > size of Subject, Start + Length < 0, or Start + Length >
              size of Subject, a badarg exception is raised.
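
              A sketch of the simplest cases, with sample data: all matches,
              matches restricted to a scope, and no match at all:

              1> binary:matches(<<"abcabc">>, <<"bc">>).
              [{1,2},{4,2}]
              2> binary:matches(<<"abcabc">>, <<"bc">>, [{scope, {0,3}}]).
              [{1,2}]
              3> binary:matches(<<"abcabc">>, <<"x">>).
              []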

       part(Subject, PosLen) -> binary()

              Types:

                 Subject = binary()
                 PosLen = part()

              Extracts the part of binary Subject described by PosLen.

              A  negative  length can be used to extract bytes at the end of a
              binary:

              1> Bin = <<1,2,3,4,5,6,7,8,9,10>>.
              2> binary:part(Bin, {byte_size(Bin), -5}).
              <<6,7,8,9,10>>

          Note:
              part/2 and part/3 are also available in the erlang module  under
              the  names  binary_part/2  and binary_part/3. Those BIFs are al-
              lowed in guard tests.

              If PosLen in any way references outside the binary, a badarg ex-
              ception is raised.

       part(Subject, Pos, Len) -> binary()

              Types:

                 Subject = binary()
                 Pos = integer() >= 0
                 Len = integer()

              Same as part(Subject, {Pos, Len}).
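
              A sketch comparing part/3 with the guard BIF binary_part/3.
              The fun F is a hypothetical helper written for this
              illustration; its printed value is omitted. A failing
              binary_part/3 call inside a guard makes the guard fail
              instead of raising an exception:

              1> binary:part(<<"erlang">>, 1, 3).
              <<"rla">>
              2> binary_part(<<"erlang">>, 1, 3).
              <<"rla">>
              3> F = fun(B) when binary_part(B, 0, 2) =:= <<"er">> -> true;
                        (_) -> false
                     end.
              4> F(<<"erlang">>).
              true
              5> F(<<"e">>).
              false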

       referenced_byte_size(Binary) -> integer() >= 0

              Types:

                 Binary = binary()

              If a binary references a larger binary (often described as being
              a subbinary), it can be useful to get the size of the referenced
              binary.  This  function  can be used in a program to trigger the
              use of copy/1. By copying a  binary,  one  can  dereference  the
              original, possibly large, binary that a smaller binary is a ref-
              erence to.

              Example:

              store(Binary, GBSet) ->
                NewBin =
                    case binary:referenced_byte_size(Binary) of
                        Large when Large > 2 * byte_size(Binary) ->
                           binary:copy(Binary);
                        _ ->
                           Binary
                    end,
                gb_sets:insert(NewBin,GBSet).

              In this example, we chose to copy the binary content before
              inserting it in gb_sets:set() if it references a binary more
              than twice the data size we want to keep. Of course, different
              programs call for different rules for when to copy.

              Binary sharing occurs whenever binaries are taken apart. This is
              the fundamental reason why binaries are fast: decomposition can
              always be done with O(1) complexity. In rare circumstances,
              however, this data sharing is undesirable, which is why this
              function together with copy/1 can be useful when optimizing for
              memory use.

              Example of binary sharing:

              1> A = binary:copy(<<1>>, 100).
              <<1,1,1,1,1 ...
              2> byte_size(A).
              100
              3> binary:referenced_byte_size(A).
              100
              4> <<B:10/binary, C:90/binary>> = A.
              <<1,1,1,1,1 ...
              5> {byte_size(B), binary:referenced_byte_size(B)}.
              {10,10}
              6> {byte_size(C), binary:referenced_byte_size(C)}.
              {90,100}

              In  the  above  example, the small binary B was copied while the
              larger binary C references binary A.

          Note:
              Binary data is shared among processes. If another process still
              references the larger binary, copying the part this process uses
              only consumes more memory and does not free up the larger binary
              for garbage collection. Use intrusive functions of this kind
              with extreme care and only if a real problem is detected.
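
              Continuing the binary sharing example above, the following
              sketch shows that a copy of the 90-byte part no longer
              references the 100-byte binary. The variable name D is chosen
              for this illustration:

              1> A = binary:copy(<<1>>, 100).
              <<1,1,1,1,1 ...
              2> <<_:10/binary, C:90/binary>> = A.
              <<1,1,1,1,1 ...
              3> D = binary:copy(C).
              <<1,1,1,1,1 ...
              4> {byte_size(D), binary:referenced_byte_size(D)}.
              {90,90}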

       replace(Subject, Pattern, Replacement) -> Result

              Types:

                 Subject = binary()
                 Pattern = binary() | [binary()] | cp()
                 Replacement = Result = binary()

              Same as replace(Subject, Pattern, Replacement,[]).

       replace(Subject, Pattern, Replacement, Options) -> Result

              Types:

                 Subject = binary()
                 Pattern = binary() | [binary()] | cp()
                 Replacement = binary()
                 Options = [Option]
                 Option = global | {scope, part()} | {insert_replaced, InsPos}
                 InsPos = OnePos | [OnePos]
                 OnePos = integer() >= 0
                   An integer() =< byte_size(Replacement)
                 Result = binary()

              Constructs a new binary by replacing the parts in Subject match-
              ing Pattern with the content of Replacement.

              If the matching subpart of Subject giving rise to the
              replacement is to be inserted in the result, option
              {insert_replaced, InsPos} inserts the matching part into
              Replacement at the specified position (or positions) before
              inserting Replacement into Subject.

              Example:

              1> binary:replace(<<"abcde">>,<<"b">>,<<"[]">>, [{insert_replaced,1}]).
              <<"a[b]cde">>
              2> binary:replace(<<"abcde">>,[<<"b">>,<<"d">>],<<"[]">>,[global,{insert_replaced,1}]).
              <<"a[b]c[d]e">>
              3> binary:replace(<<"abcde">>,[<<"b">>,<<"d">>],<<"[]">>,[global,{insert_replaced,[1,1]}]).
              <<"a[bb]c[dd]e">>
              4> binary:replace(<<"abcde">>,[<<"b">>,<<"d">>],<<"[-]">>,[global,{insert_replaced,[1,2]}]).
              <<"a[b-b]c[d-d]e">>

              If any position specified in InsPos is greater than the size of
              the replacement binary, a badarg exception is raised.

              Options global and {scope, part()} work as for split/3. The  re-
              turn type is always a binary().

              For a description of Pattern, see compile_pattern/1.
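
              A sketch of plain replacement without insert_replaced, using
              sample data. It shows the default (first occurrence only),
              option global, and global limited by a scope covering bytes 2
              through 4:

              1> binary:replace(<<"a,b,c">>, <<",">>, <<";">>).
              <<"a;b,c">>
              2> binary:replace(<<"a,b,c">>, <<",">>, <<";">>, [global]).
              <<"a;b;c">>
              3> binary:replace(<<"a,b,c">>, <<",">>, <<";">>,
                                [global, {scope, {2,3}}]).
              <<"a,b;c">>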

       split(Subject, Pattern) -> Parts

              Types:

                 Subject = binary()
                 Pattern = binary() | [binary()] | cp()
                 Parts = [binary()]

              Same as split(Subject, Pattern, []).

       split(Subject, Pattern, Options) -> Parts

              Types:

                 Subject = binary()
                 Pattern = binary() | [binary()] | cp()
                 Options = [Option]
                 Option = {scope, part()} | trim | global | trim_all
                 Parts = [binary()]

              Splits  Subject into a list of binaries based on Pattern. If op-
              tion global is not specified, only the first occurrence of  Pat-
              tern in Subject gives rise to a split.

              The  parts  of  Pattern found in Subject are not included in the
              result.

              Example:

              1> binary:split(<<1,255,4,0,0,0,2,3>>, [<<0,0,0>>,<<2>>],[]).
              [<<1,255,4>>, <<2,3>>]
              2> binary:split(<<0,1,0,0,4,255,255,9>>, [<<0,0>>, <<255,255>>],[global]).
              [<<0,1>>,<<4>>,<<9>>]

              Summary of options:

                {scope, part()}:
                  Works as in match/3 and matches/3. Notice that this only
                  defines the scope of the search for matching strings; it
                  does not cut the binary before splitting. The bytes before
                  and after the scope are kept in the result. See the example
                  below.

                trim:
                  Removes trailing empty parts of the result (as does trim  in
                  re:split/3).

                trim_all:
                  Removes all empty parts of the result.

                global:
                  Repeats  the  split until Subject is exhausted. Conceptually
                  option global makes split work on the positions returned  by
                  matches/3,  while it normally works on the position returned
                  by match/3.

              Example of the difference between a scope and taking the  binary
              apart before splitting:

              1> binary:split(<<"banana">>, [<<"a">>],[{scope,{2,3}}]).
              [<<"ban">>,<<"na">>]
              2> binary:split(binary:part(<<"banana">>,{2,3}), [<<"a">>],[]).
              [<<"n">>,<<"n">>]
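
              A sketch of how options trim and trim_all affect empty parts,
              using sample data chosen for this illustration:

              1> binary:split(<<"a,b,,">>, <<",">>, [global]).
              [<<"a">>,<<"b">>,<<>>,<<>>]
              2> binary:split(<<"a,b,,">>, <<",">>, [global, trim]).
              [<<"a">>,<<"b">>]
              3> binary:split(<<",a,,b,">>, <<",">>, [global, trim_all]).
              [<<"a">>,<<"b">>]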

              The return type is always a list of binaries that are all refer-
              encing Subject. This means that  the  data  in  Subject  is  not
              copied  to new binaries, and that Subject cannot be garbage col-
              lected until the results of the split are no longer referenced.

              For a description of Pattern, see compile_pattern/1.

Ericsson AB                       stdlib 3.13                     binary(3erl)