is one of the words ]
;; [ that was passed to OPEN-INPUT. ]
;; [---------------------------------------------------------------------------]
READ-RECORD: does [
;; pick a line if there are lines left to be picked
RECORD-NUMBER: RECORD-NUMBER + 1
if (RECORD-NUMBER > FILE-SIZE) [
EOF: true
return EOF
]
RECORD-AREA: copy ""
RECORD-AREA: copy pick FILE-DATA RECORD-NUMBER
;; Set the words passed to the "open" function to values extracted
;; out of the data, based on the locations passed to the "open" function.
;; Put those words and values in the RECORD object.
RECORD: make object! []
foreach [FIELDNAME POSITION] FIELDS [
RECORD-AREA: head RECORD-AREA
RECORD-AREA: skip RECORD-AREA (POSITION/x - 1)
RECORD: make RECORD compose [
(to-set-word FIELDNAME) copy/part RECORD-AREA POSITION/y]
]
return EOF
]
;; [---------------------------------------------------------------------------]
;; [ Open a file for output. What does that mean? ]
;; [ A common way of working with files in REBOL is to have the whole file ]
;; [ in memory, so we will do that. ]
;; [ We will clear out our data areas, and then when we "write" to the file ]
;; [ we will add a formatted line to the data area, and then write the ]
;; [ whole data area to disk when we "close" the file. ]
;; [ To make the supplied field names available for values, we will create ]
;; [ a RECORD object out of the supplied names. ]
;; [ The caller will set values in FFF/RECORD/data-name. ]
;; [---------------------------------------------------------------------------]
OPEN-OUTPUT: func [
FILEID [file!]
FIELDLIST [block!]
] [
FILE-ID: FILEID
FIELDS: copy FIELDLIST
FILE-DATA: copy []
FILE-SIZE: 0
RECORD-NUMBER: 0
EOF: false
RECORD: make object! []
foreach [FIELDNAME POSITION] FIELDS [
RECORD: make RECORD compose [
(to-set-word FIELDNAME) copy ""]
]
]
;; [---------------------------------------------------------------------------]
;; [ When writing a file, we have to have a "close" procedure to actually ]
;; [ put the data into a disk file. ]
;; [---------------------------------------------------------------------------]
CLOSE-OUTPUT: does [
write/lines FILE-ID FILE-DATA
]
;; [---------------------------------------------------------------------------]
;; [ Write a record. What does this mean? ]
;; [ The caller will have set values to the words passed to the "open" ]
;; [ function, using the RECORD object created at open time. ]
;; [ That is, set a value to FFF/RECORD/data-name. ]
;; [ What we do with them is to put the values of those words ]
;; [ into the specified positions in the record area, and then append the ]
;; [ record area to the data area. ]
;; [ To build the record area, we can't append because we might not be ]
;; [ adding data from front to back; we can't insert because that might ]
;; [ move previously-inserted data. So we will have to make a big blank ]
;; [ string, "change" data, and then trim off the right end. ]
;; [ Remember that our data file is "line sequential" which means that the ]
;; [ lines end with an LF and can vary in length. ]
;; [---------------------------------------------------------------------------]
WRITE-RECORD: does [
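;; "make string! 1028" alone gives an EMPTY string, so fill it with spaces
RECORD-AREA: head insert/dup make string! 1028 #" " 1028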
foreach [FIELDNAME POSITION] FIELDS [
RECORD-AREA: head RECORD-AREA
RECORD-AREA: skip RECORD-AREA (POSITION/x - 1)
change/part RECORD-AREA RECORD/:FIELDNAME POSITION/y
]
RECORD-AREA: head RECORD-AREA
RECORD-AREA: trim/tail RECORD-AREA
append FILE-DATA RECORD-AREA
]
]
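The record-building technique described in the WRITE-RECORD comments can be
tried on its own at the console. The following is a minimal sketch and not
part of the module: the word DEMO-AREA, the 40-character width, and the sample
values are assumptions made for illustration. It fills a string with spaces,
overwrites two fields out of order, and trims the right end.

```rebol
REBOL [
    title: "Blank record sketch"
]
;; DEMO-AREA is a hypothetical word used only for this sketch.
;; Build a string of 40 spaces to stand in for the record area.
DEMO-AREA: head insert/dup copy "" #" " 40
;; Write the later field first, to show that order does not matter.
;; "change" without refinements overwrites characters in place, so the
;; length of the string and the position of other data stay the same.
change skip DEMO-AREA 30 "6129261001"    ;; columns 31 to 40
change skip DEMO-AREA 0 "Jordan"         ;; columns 1 to 6
;; Remove any unused spaces from the right end.
DEMO-AREA: trim/tail DEMO-AREA
print length? DEMO-AREA
print rejoin ["'" DEMO-AREA "'"]
```

Because plain "change" replaces exactly as many characters as it is given,
a value that is shorter than its field leaves the rest of the field blank
instead of pulling later data to the left.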
Here is a demo using the above module. To make the demo work,
you will have to save the above module as "fffobj.r" on your own computer.
REBOL [
title: "FFF object demo"
]
;; [---------------------------------------------------------------------------]
;; [ Show how to use the fixed-format file object. ]
;; [---------------------------------------------------------------------------]
do %fffobj.r
TEST-FIXED-FILE-ID: %test-fixedformat.txt
FFF1: make FFF [] ;; make an instance of the FFF object
FFF1/OPEN-INPUT TEST-FIXED-FILE-ID [ ;; read the file, make column heading words
NAME 1X10
ADDRESS 11X20
PHONE 31X10
DATE 41X11
AMT 52X7
CODE 59X2
COUNT 61X2
]
FFF1/READ-RECORD ;; read first record to get set up for 'until' loop
until [ ;; do this loop until last item in it becomes true
probe FFF1/RECORD
print rejoin ["NAME ='" FFF1/RECORD/NAME "' of type " type? FFF1/RECORD/NAME]
print rejoin ["ADDRESS ='" FFF1/RECORD/ADDRESS "' of type " type? FFF1/RECORD/ADDRESS]
print rejoin ["PHONE ='" FFF1/RECORD/PHONE "' of type " type? FFF1/RECORD/PHONE]
print rejoin ["DATE ='" FFF1/RECORD/DATE "' of type " type? FFF1/RECORD/DATE]
print rejoin ["AMT ='" FFF1/RECORD/AMT "' of type " type? FFF1/RECORD/AMT]
print rejoin ["CODE ='" FFF1/RECORD/CODE "' of type " type? FFF1/RECORD/CODE]
print rejoin ["COUNT ='" FFF1/RECORD/COUNT "' of type " type? FFF1/RECORD/COUNT]
print "-------------------------------------------"
FFF1/READ-RECORD ;; reading next record at end of loop returns EOF flag
]
halt
Note again how you "open" the file and supply the function with names and
locations and lengths of the "fields" in the data record.
The "read" procedure will create an object, called RECORD, with those column
names and values assigned to them.
Here is the result of the above demo.
make object! [
NAME: "Jordan "
ADDRESS: "1801 Main St "
PHONE: "6129261001"
DATE: "01-JAN-2001"
AMT: "0123456"
CODE: "X1"
COUNT: "21"
]
NAME ='Jordan ' of type string
ADDRESS ='1801 Main St ' of type string
PHONE ='6129261001' of type string
DATE ='01-JAN-2001' of type string
AMT ='0123456' of type string
CODE ='X1' of type string
COUNT ='21' of type string
-------------------------------------------
make object! [
NAME: "James "
ADDRESS: "1801 Main St "
PHONE: "6129261002"
DATE: "02-FEB-2002"
AMT: "0234567"
CODE: "X1"
COUNT: "22"
]
NAME ='James ' of type string
ADDRESS ='1801 Main St ' of type string
PHONE ='6129261002' of type string
DATE ='02-FEB-2002' of type string
AMT ='0234567' of type string
CODE ='X1' of type string
COUNT ='22' of type string
-------------------------------------------
make object! [
NAME: "Jeremy "
ADDRESS: "1801 Main St "
PHONE: "6129261003"
DATE: "03-MAR-2004"
AMT: "0345678"
CODE: "X1"
COUNT: "23"
]
NAME ='Jeremy ' of type string
ADDRESS ='1801 Main St ' of type string
PHONE ='6129261003' of type string
DATE ='03-MAR-2004' of type string
AMT ='0345678' of type string
CODE ='X1' of type string
COUNT ='23' of type string
-------------------------------------------
>>
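The way a "start x length" pair picks a field out of a line can be seen in
isolation. This is a small sketch assuming REBOL 2, where the parts of a
pair are integers; DEMO-LINE and DEMO-FIELDS are hypothetical words and the
sample line is made up to match the first three columns of the demo data.

```rebol
REBOL [
    title: "Field slicing sketch"
]
;; DEMO-LINE and DEMO-FIELDS are hypothetical words for this sketch.
;; The line holds a 10-character name, a 20-character address,
;; and a 10-character phone number.
DEMO-LINE: "Jordan    1801 Main St        6129261001"
DEMO-FIELDS: [NAME 1x10 ADDRESS 11x20 PHONE 31x10]
foreach [FIELDNAME POSITION] DEMO-FIELDS [
    ;; POSITION/x is the starting column, POSITION/y is the length.
    ;; "skip" counts from zero, so subtract one from the column number.
    VALUE: copy/part skip DEMO-LINE (POSITION/x - 1) POSITION/y
    print rejoin [FIELDNAME " ='" VALUE "'"]
]
```

The values keep their trailing spaces, just as in the demo output above.
Wrap the "copy/part" in "trim" if you want them removed.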
To summarize what you have seen above, REBOL is not natively "at home"
in the world of fixed-format data, but it has some nice tricks up its
sleeve in its ability to write its own code at run time, so we can
use those tricks to make it very easy to access data in text files of
this kind. If you expect to just report on this data, you are set.
If you want to do any calculations, then you will have to use the
various REBOL "to-" functions to convert strings in the file to the
needed data types.
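As a rough sketch of that conversion step, here are three of the strings
from the demo output turned into other data types. The words are
hypothetical, and treating AMT as an amount with two implied decimal places
is an assumption; the file layout itself does not say what the digits mean.

```rebol
REBOL [
    title: "String conversion sketch"
]
;; Sample strings as they appear in the demo output.
AMT-TEXT: "0123456"
COUNT-TEXT: "21"
DATE-TEXT: "01-JAN-2001"
;; Convert each string to a type we can calculate with.
COUNT-VALUE: to-integer COUNT-TEXT
;; Assumption: AMT carries two implied decimal places.
AMT-VALUE: (to-integer AMT-TEXT) / 100
DATE-VALUE: to-date DATE-TEXT
print ["COUNT plus one:" COUNT-VALUE + 1]
print ["AMT as a number:" AMT-VALUE]
print ["DATE plus 30 days:" DATE-VALUE + 30]
```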
===But wait, there's more
With REBOL's "data is code" features, one might wonder what other ways
REBOL can do things at run time that would be done at compile time in
other languages.
---HTML report module
Here is a module and a demo that builds on the CSV
object previously presented. This module can be used to present a basic
columnar report of data items specified at run time. One calls a
procedure with a list of words, and the procedure evaluates those words
and puts them into html markup. The words to be reported on are not
known until run time. Here is the module. To run the coming demo,
save it as "htmlrep.r" on your computer. In this module, there is a
lot of documentation in comments before the REBOL header.
TITLE
HTML report
SUMMARY
This is a module to help make a "report" that is directed to an html table.
It provides services to "open" and "close" the report, and to emit heading
and detail lines. The result will be a single html file for viewing on
a screen. For a paper copy of the "report," one would print the html page.
The module does not provide any page breaks that would make the printed
version of this page look good. Controlling printing to physical paper
is not part of the mission of html.
DOCUMENTATION
Load the module into your program with:
do %htmlrep.r
Before the first call:
1. Put a file name in HTMLREP-FILE-ID. This should be a value with
the type of "file." In other words, put a percent sign in front of it.
2. Put a value in HTMLREP-TITLE.
3. Put a program name in HTMLREP-PROGRAM-NAME. This will appear in a footer.
4. Call HTMLREP-OPEN.
Optionally, before "printing" the first detail line, call HTMLREP-EMIT-HEAD
in the following manner:
HTMLREP-EMIT-HEAD ["literal-1" ... "literal-n"]
where literal-1, etc., are strings to be turned into <th> entries.
To "print" a line of data, call HTMLREP-EMIT-LINE in the following manner:
HTMLREP-EMIT-LINE reduce [word-1...word-n]
where word-n is the word whose value you want to print. The procedure will
generate a <td> entry for each word, in one row of an html table.
Historical note: In the first version of this module, we just passed the
words in a block and did not reduce the block, and the HTMLREP-EMIT-LINE
procedure used the "get" function to get the values of the words.
This turned out not to work if the words passed in were in an object, so
we moved the "reduction" process up to the level of the caller.
Now we pass values to HTMLREP-EMIT-LINE instead of words.
At the end:
Call HTMLREP-CLOSE. You MUST do this step because all the other procedures
just build up an html string in memory. The HTMLREP-CLOSE procedure actually
writes the data to disk under the name you loaded into HTMLREP-FILE-ID.
SCRIPT
REBOL [
Title: "HTML report"
]
;; [---------------------------------------------------------------------------]
;; [ Items set up by the caller. ]
;; [---------------------------------------------------------------------------]
HTMLREP-FILE-ID: %htmlrep.html
HTMLREP-TITLE: " "
HTMLREP-PRE-STRING: " "
HTMLREP-POST-STRING: " "
HTMLREP-PROGRAM-NAME: " "
HTMLREP-CODE-BLOCK: " "
;; [---------------------------------------------------------------------------]
;; [ Internal working items. ]
;; [---------------------------------------------------------------------------]
HTMLREP-FILE-OPEN: false
;; [---------------------------------------------------------------------------]
;; [ This is the top of the html page. ]
;; [---------------------------------------------------------------------------]
HTMLREP-PAGE-HEAD: {
<html>
<head>
<title><%HTMLREP-TITLE%></title>
</head>
<body>
<table>
<tr>
<td>REBOL Reporting Services</td>
<td>Created on: <% now %></td>
</tr>
</table>
<h1><%HTMLREP-TITLE%></h1>
<% HTMLREP-PRE-STRING %>
<table border="1">
}
;; [---------------------------------------------------------------------------]
;; [ This is the end of the html page. ]
;; [---------------------------------------------------------------------------]
HTMLREP-PAGE-FOOT: {
</table>
<% HTMLREP-POST-STRING %>
<p>
The above report was produced by the Information Systems Division.
Refer to a program called "<% HTMLREP-PROGRAM-NAME %>."
</p>
<% HTMLREP-CODE-BLOCK %>
</body>
</html>
}
;; [---------------------------------------------------------------------------]
;; [ This is the area where we will build up the html page in memory. ]
;; [---------------------------------------------------------------------------]
HTMLREP-PAGE: make string! 5000
;; [---------------------------------------------------------------------------]
;; [ This is the procedure to "open" the report. ]
;; [ The "build-markup" function will replace the placeholders in the html ]
;; [ with the values resulting from their evaluation. ]
;; [---------------------------------------------------------------------------]
HTMLREP-OPEN: does [
HTMLREP-PAGE: copy ""
append HTMLREP-PAGE build-markup HTMLREP-PAGE-HEAD
append HTMLREP-PAGE newline
HTMLREP-FILE-OPEN: true
]
;; [---------------------------------------------------------------------------]
;; [ This is the procedure to "close" the report. ]
;; [ It writes to disk the html page we have built up in memory. ]
;; [---------------------------------------------------------------------------]
HTMLREP-CLOSE: does [
append HTMLREP-PAGE build-markup HTMLREP-PAGE-FOOT
append HTMLREP-PAGE newline
write HTMLREP-FILE-ID HTMLREP-PAGE
HTMLREP-FILE-OPEN: false
]
;; [---------------------------------------------------------------------------]
;; [ This procedure emits a row of an html table containing heading ]
;; [ elements supplied by the caller in a block of strings. ]
;; [---------------------------------------------------------------------------]
HTMLREP-EMIT-HEAD: func [
"Emit a heading row with literals supplied in a block"
HTMLREP-HEADING-BLOCK [block!]
] [
append HTMLREP-PAGE "<tr>"
foreach HTMLREP-HEAD-LIT HTMLREP-HEADING-BLOCK [
append HTMLREP-PAGE "<th>"
append HTMLREP-PAGE to-string HTMLREP-HEAD-LIT ; to-string just in case
append HTMLREP-PAGE "</th>" ; caller supplied words
]
append HTMLREP-PAGE "</tr>"
append HTMLREP-PAGE newline
]
;; [---------------------------------------------------------------------------]
;; [ This procedure emits a row of an html table containing the values of ]
;; [ words supplied by the caller in a block. ]
;; [ Note the requirement that the caller "reduce" the block passed to this ]
;; [ function so that we are getting values and not words. ]
;; [---------------------------------------------------------------------------]
HTMLREP-EMIT-LINE: func [
"Emit a detail row with values supplied in a block"
HTMLREP-DETAIL-BLOCK [block!]
] [
append HTMLREP-PAGE "<tr>"
foreach HTMLREP-VALUE HTMLREP-DETAIL-BLOCK [
append HTMLREP-PAGE "<td>"
append HTMLREP-PAGE HTMLREP-VALUE
append HTMLREP-PAGE "</td>"
]
append HTMLREP-PAGE "</tr>"
append HTMLREP-PAGE newline
]
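The placeholder replacement done by "build-markup" can be tried by itself.
This is a small sketch assuming a REBOL 2 release that includes the
"build-markup" function; DEMO-TITLE and DEMO-TEMPLATE are hypothetical words
that are not part of the module.

```rebol
REBOL [
    title: "build-markup sketch"
]
;; DEMO-TITLE and DEMO-TEMPLATE are hypothetical words for this sketch.
DEMO-TITLE: "Quick listing"
;; Anything between <% and %> is evaluated as REBOL code and the
;; result is put into the string in place of the placeholder.
DEMO-TEMPLATE: {<h1><% DEMO-TITLE %></h1> Created on: <% now/date %>}
DEMO-RESULT: build-markup DEMO-TEMPLATE
print DEMO-RESULT
```

Note that the placeholders are evaluated when "build-markup" is called,
which is why the caller must load HTMLREP-TITLE and the other words before
calling HTMLREP-OPEN.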
Now, using the above html reporting module, the CSV object module, and the
CSV test data file from the previous script that made our test data, you
can run the following demo to make a quick html listing of the CSV data.
REBOL [
Title: "Show usage of csvobj.r and htmlrep.r"
]
do %csvobj.r
do %htmlrep.r
TEST-CSV-FILE-ID: %test-csvformat.csv
DEMO-REPORT-FILE-ID: %test-csvlisting.html
;; [---------------------------------------------------------------------------]
;; [ Create a CSV object for the above-mentioned file. ]
;; [ Bring the file into memory. ]
;; [ Read the first record to prepare for looping through all records. ]
;; [---------------------------------------------------------------------------]
DEMOCSV: make CSV []
DEMOCSV/CSVOPEN TEST-CSV-FILE-ID
DEMOCSV/CSVREAD
;; [---------------------------------------------------------------------------]
;; [ Prepare the html report. Load headings, set file names, etc. ]
;; [---------------------------------------------------------------------------]
HTMLREP-FILE-ID: DEMO-REPORT-FILE-ID
HTMLREP-TITLE: copy "Quick CSV file listing"
HTMLREP-PROGRAM-NAME: copy "csvhtmldemo.r"
HTMLREP-OPEN
HTMLREP-EMIT-HEAD DEMOCSV/HEADINGS
;; [---------------------------------------------------------------------------]
;; [ Loop until the CSVREAD function returns the EOF marker (End Of File). ]
;; [ We do have to do a bit of data conversion, as the modules currently ]
;; [ are written. ]
;; [ HTMLREP-EMIT-LINE expects a block of values. ]
;; [ The items in DEMOCSV/HEADINGS are strings, and so must be converted to ]
;; [ words so that they can be evaluated and their values appended to ]
;; [ VALUE-BLOCK. ]
;; [ But still, that's not a lot of work. ]
;; [---------------------------------------------------------------------------]
until [
VALUE-BLOCK: copy []
foreach WORD DEMOCSV/HEADINGS [
VALUE-NAME: to-word WORD
append VALUE-BLOCK DEMOCSV/RECORD/:VALUE-NAME
]
HTMLREP-EMIT-LINE VALUE-BLOCK
DEMOCSV/CSVREAD
]
;; [---------------------------------------------------------------------------]
;; [ Put the output file on disk and show it to confirm we are done. ]
;; [---------------------------------------------------------------------------]
HTMLREP-CLOSE
browse DEMO-REPORT-FILE-ID
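The two ways of turning names into a block of values, as used above, can be
compared in a few lines. This is only a sketch: DEMO-RECORD and
DEMO-HEADINGS are hypothetical stand-ins for DEMOCSV/RECORD and
DEMOCSV/HEADINGS, with made-up contents.

```rebol
REBOL [
    title: "Values from an object sketch"
]
;; Hypothetical stand-ins for DEMOCSV/RECORD and DEMOCSV/HEADINGS.
DEMO-RECORD: make object! [NAME: "Jordan" COUNT: "21"]
DEMO-HEADINGS: ["NAME" "COUNT"]
;; Case 1: the names are known when the code is written.
;; "reduce" evaluates each path and returns a block of values.
KNOWN-BLOCK: reduce [DEMO-RECORD/NAME DEMO-RECORD/COUNT]
probe KNOWN-BLOCK
;; Case 2: the names are known only at run time, as strings.
;; Convert each string to a word, then use the word in a path
;; with a leading colon to fetch the value it names.
VALUE-BLOCK: copy []
foreach HEADING DEMO-HEADINGS [
    VALUE-NAME: to-word HEADING
    append VALUE-BLOCK DEMO-RECORD/:VALUE-NAME
]
probe VALUE-BLOCK
```

Both blocks hold the same two strings, and either one could be passed to
HTMLREP-EMIT-LINE.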
---Simple lookup table
Here is a way to use a CSV file to make a simple lookup table.
This will require the CSV object from above, a bit of copying and pasting
from below, and running a demo script to follow. Or, you could just read
about it since it is not complicated.
To start, copy the following lines and paste them into a text editor,
and save them as "postalcodes.csv" on your computer. These are a handful
of United States postal codes (or state abbreviations) just so we can have
some demo data to work with. If you copy them out and get leading
indentations, you will have to remove those by hand. They have the
leading spaces in this document to make them look like code, but we
don't want the leading spaces in the file.
POSTALCODE,STATENAME
AL,Alabama
AK,Alaska
MN,Minnesota
WI,Wisconsin
ND,North Dakota
SD,South Dakota
The list was short because this is a demo. Now, copy out the following
script and run it. You will have to save it as a script file because it
is going to run the csvobj.r module that we made previously.
What this demo will do is pull the data out of the file we just made,
and save it on disk in a way such that REBOL can load it with the
"load" function. When it is loaded in that manner, it will become a
block that can be searched with the "select" function.
REBOL [
title: "Make postal code table"
]
do %csvobj.r
POSTAL-TABLE: copy []
POSTAL-FILE: %postalcodes.txt
POSTALCODES: make CSV []
POSTALCODES/CSVOPEN %postalcodes.csv
POSTALCODES/CSVREAD
until [
append POSTAL-TABLE POSTALCODES/RECORD/POSTALCODE
append POSTAL-TABLE POSTALCODES/RECORD/STATENAME
POSTALCODES/CSVREAD
]
save POSTAL-FILE POSTAL-TABLE
alert "Done"
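To see what "save" and "load" do with a block, here is a small round trip
you can try at the console. It is a sketch only; %demo-table.txt is a
hypothetical scratch file name and DEMO-TABLE holds made-up contents.

```rebol
REBOL [
    title: "save and load sketch"
]
;; %demo-table.txt is a hypothetical scratch file for this sketch.
DEMO-TABLE: ["AL" "Alabama" "AK" "Alaska"]
save %demo-table.txt DEMO-TABLE
;; Show the text that "save" put on disk.
print read %demo-table.txt
;; Bring it back as REBOL data.
RELOADED: load %demo-table.txt
print type? RELOADED
print equal? DEMO-TABLE RELOADED
```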
And now, run the following demo. It will load the postal code table
created above, in a format that REBOL can work with, and, since the
postal codes are not duplicated anywhere in the state names, we can
use the REBOL "select" function to obtain a state name based on the
postal code.
REBOL [
title: "Demo postal code table"
]
POSTAL-TABLE: copy []
POSTAL-TABLE: load %postalcodes.txt
print ["MN is" select POSTAL-TABLE "MN"]
print ["AL is" select POSTAL-TABLE "AL"]
print ["SD is" select POSTAL-TABLE "SD"]
print ["VT is" select POSTAL-TABLE "VT"]
halt
Here is the result:
MN is Minnesota
AL is Alabama
SD is South Dakota
VT is none
>>
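If "none" is not what you want to show for a missing code, the lookup can
be wrapped in a small function. This is a sketch with hypothetical names
(DEMO-TABLE and STATE-OF) and a made-up default string.

```rebol
REBOL [
    title: "Lookup with default sketch"
]
;; DEMO-TABLE and STATE-OF are hypothetical names for this sketch.
DEMO-TABLE: ["MN" "Minnesota" "SD" "South Dakota"]
STATE-OF: func [
    "Return the state name for a postal code, or a default text"
    CODE
] [
    ;; "select" returns none when the code is not found, and "any"
    ;; then moves on to the next value in the block.
    any [select DEMO-TABLE CODE "(unknown code)"]
]
print ["MN is" STATE-OF "MN"]
print ["VT is" STATE-OF "VT"]
;; "select" looks at every item, not just the codes, which is why it
;; matters that no code is also a state name:
print ["After Minnesota comes" select DEMO-TABLE "Minnesota"]
```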
===Here there be monsters
The examples above do not look like examples from other sources on the
internet. Why might that be? For a beginner, it can be helpful to
plod along deliberately, to keep things straight in one's head.
Use temporary variables for intermediate results, use global variables
so they can be probed, write your own loops so you can display results
as the program runs, things like that. Computers are so fast now that
one can forget that everything has a cost.
Without knowing how REBOL works on the inside, we can't know exactly
what costs there are to different things, but are there some assumptions
we could make?
One obvious assumption would be that any variable has a cost in memory.
So an obvious improvement in any program would be to avoid using
more variables than necessary. We could just adopt that as a general
rule.
Another assumption that might be valid is in the area of loops and
using REBOL functions. There are functions, like "copy" that
almost certainly have loops in them somewhere, down at some low
level. If one wanted to copy a string of characters, and coded
one's own loop that used "copy" as one of the statements in the
loop, might one be, at a low level, creating a loop within a loop?
The answer is, we don't know. But it might be safe to adopt,
as another general rule, using REBOL's functions whenever possible
instead of reinventing things, even if the re-invention helps in
your understanding of your own program.
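As a small illustration of that rule, here are two ways to copy a string,
one with a hand-written loop and one with the built-in function. The words
are hypothetical, and the sketch makes no claim about which is faster; it
only shows that the result is the same with less code.

```rebol
REBOL [
    title: "Loop versus built-in sketch"
]
SOURCE-TEXT: "ABCD1234"
;; Hand-written loop: add one character at a time.
LOOP-RESULT: copy ""
foreach CH SOURCE-TEXT [append LOOP-RESULT CH]
;; Built-in function: one call does the whole job.
COPY-RESULT: copy SOURCE-TEXT
print equal? LOOP-RESULT COPY-RESULT
```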
And is there a more general rule that includes the above two rules
plus others that we might not be aware of? Looking at REBOL code
on the internet, from people who are highly skilled with it, it
appears that the general rule might be just to keep the code
compact. The less you say, the more likely it is that you are
using the REBOL functions to best effect and not doing things that
are not necessary.
With that general principle in mind, let's revisit some of the
above functions and try to streamline them a bit.
---SPACEFILL, improved
Here is a more compact version, with notes following.
The notes will refer to the SPACEFILL function defined earlier.
REBOL [
title: "SPACEFILL function, improved"
]
SPACEFILL: func [
"Left justify a string, pad with spaces to specified length"
INPUT-STRING
FINAL-LENGTH
] [
head insert/dup tail copy/part trim INPUT-STRING FINAL-LENGTH #" " max 0 FINAL-LENGTH - length? INPUT-STRING
]
;; Uncomment to test
;print rejoin ["'" SPACEFILL " ABCD1234 " 10 "'"]
;halt
First, let's be sure we understand it.
REBOL evaluates an expression from left to right, but a function cannot
run until its arguments have been evaluated, which means we have to
work our way into the innermost function first because that produces the
results passed to the functions to the left.
The innermost function is "trim" which takes the spaces off both ends
of the INPUT-STRING.
The next function is the "copy/part" which makes a copy of the INPUT-STRING
but for only as many characters as specified in the FINAL-LENGTH.
The reason for this is that the caller might have asked for a final length
less than the actual length of the data being padded. That makes no
sense, but it must be accounted for. If the caller asked for a final
length greater than the INPUT-STRING, as would be normal, the "copy/part"
will copy only as many characters as there actually are in the trimmed
INPUT-STRING.
The next function is "tail" which positions us to the end of the
trimmed and copied string.
The next function is the "insert/dup" function which adds the "space"
character (#" ") to the tail of the copied string, for a specified number
of times. And what is that specified number? It is the maximum of zero
(in case we don't have to add any) or however many more spaces we need to
reach the desired length. And how many is that? It is the FINAL-LENGTH
minus the number of characters we already have, which is the current
length of the INPUT-STRING.
And finally, to make sure we return to the caller the whole padded string,
we position ourselves at the head of the copied string.
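The steps just described can be run one at a time, with each intermediate
result held in a word so it can be probed. This is a sketch for study only;
the words TXT, LEN, and STEP-1 through STEP-4 are hypothetical and are
exactly the kind of temporary variables the compact version avoids.

```rebol
REBOL [
    title: "SPACEFILL step by step"
]
TXT: copy "  ABCD1234  "
LEN: 10
;; Step 1: take the spaces off both ends (TXT itself is changed).
STEP-1: trim TXT
;; Step 2: copy at most LEN characters of the trimmed string.
STEP-2: copy/part STEP-1 LEN
;; Step 3: move to the end of the copy.
STEP-3: tail STEP-2
;; Step 4: add however many spaces are still needed, never fewer than zero.
STEP-4: insert/dup STEP-3 #" " max 0 LEN - length? TXT
;; Step 5: go back to the front so the whole string is returned.
RESULT: head STEP-4
print rejoin ["'" RESULT "'"]
```

With this input the trimmed string has eight characters, so two spaces are
added and the result is ten characters long.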
Now let's note the improvements.
There are no local variables, compared to our previous version.
We trim the INPUT-STRING, but we don't have to store it in a temporary
variable because we can just pass it up the line of function calls.
Similarly, the LENGTH-OF-TRIMMED-STRING and NUMBER-OF-SPACES-TO-ADD
are calculated on the fly and don't need temporary variables.
And the FINAL-PADDED-STRING is not necessary because we just pad the
copy of INPUT-STRING and pass that back to the caller.
And finally, to go the last step in REBOL-izing the original SPACEFILL
function, we will shorten up some of our variables and condense the
code a bit to get:
REBOL [
title: "SPACEFILL function, improved"
]
SPACEFILL: func [txt len] [head insert/dup tail copy/part trim txt len #" " max 0 len - length? txt]
;; Uncomment to test
;print rejoin ["'" SPACEFILL " ABCD1234 " 10 "'"]
;halt
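Here is a short test of the compact version that also shows two things
worth knowing about it. The words ORIGINAL, PADDED, and CLIPPED are
hypothetical. The first point is that "trim" works on the caller's own
string, so the caller's string comes back trimmed. The second is that a
string longer than the requested length is cut off.

```rebol
REBOL [
    title: "SPACEFILL usage sketch"
]
SPACEFILL: func [txt len] [head insert/dup tail copy/part trim txt len #" " max 0 len - length? txt]
;; A string shorter than the requested length is padded on the right.
ORIGINAL: copy "  AB  "
PADDED: SPACEFILL ORIGINAL 5
print rejoin ["'" PADDED "'"]
;; The caller's string was trimmed in place by the function.
print rejoin ["'" ORIGINAL "'"]
;; A string longer than the requested length is cut off.
CLIPPED: SPACEFILL copy "ABCDEFGH" 5
print rejoin ["'" CLIPPED "'"]
```

If the side effect on the caller's string is not wanted, the caller can
pass in a copy, as the last call does.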
---SPACEFILL-LEFT, improved
Modeling after our efforts to streamline SPACEFILL (thanks to some help
from the REBOL community on the internet), here is a shorter version of
SPACEFILL-LEFT which adds padding on the left.
REBOL [
title: "SPACEFILL-LEFT function, improved"
]
SPACEFILL-LEFT: func [
"Right justify a string, pad with spaces to specified length"
INPUT-STRING
FINAL-LENGTH
] [
trim INPUT-STRING
either FINAL-LENGTH > length? INPUT-STRING [
return head insert/dup INPUT-STRING " " FINAL-LENGTH - length? INPUT-STRING
] [
return copy/part INPUT-STRING FINAL-LENGTH
]
]
;; Uncomment to test
;print rejoin [{'} SPACEFILL-LEFT " ABCD1234 " 10 {'}]
;print rejoin [{'} SPACEFILL-LEFT " XXX YYY 123 " 10 {'}]
;halt
This is not quite as compact, but it does take out some stuff that is not
needed.
The temporary variables are gone because what they held can be derived
within a line of function calls. The "trim" function does not copy the
string that is trimmed, so it is not necessary to have a temporary copy
of the trimmed INPUT-STRING. The "insert/dup" starts inserting at the
head, so we don't need a loop to keep returning to the head and adding
a space there.
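For comparison, the one-line style used for SPACEFILL can be applied here
too. The following is a sketch under a hypothetical name,
SPACEFILL-LEFT-SHORT, and it is not the version shown above: it pads a copy
of the trimmed string instead of the caller's string, so only the "trim"
touches the original.

```rebol
REBOL [
    title: "SPACEFILL-LEFT one-line sketch"
]
;; SPACEFILL-LEFT-SHORT is a hypothetical name for this variant.
SPACEFILL-LEFT-SHORT: func [
    "Right justify a string, pad with spaces to specified length"
    txt
    len
] [
    ;; Trim, copy at most len characters, insert the needed spaces at
    ;; the head of the copy, then return to the head of the result.
    head insert/dup copy/part trim txt len #" " max 0 len - length? txt
]
SHORT-RESULT: SPACEFILL-LEFT-SHORT copy "  ABCD1234  " 10
LONG-RESULT: SPACEFILL-LEFT-SHORT copy "XXX YYY 123" 10
print rejoin ["'" SHORT-RESULT "'"]
print rejoin ["'" LONG-RESULT "'"]
```

Whether this is an improvement is a matter of taste. The "either" version
above says more plainly that there are two cases.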
In other languages, depending on the language, one would have to make
temporary variables, counters, and such, to accomplish something.
REBOL uses the method of calling functions and having the results feed
other functions, so one can do away with some of what is needed in
other languages. This method is part of REBOL's power, the need for
less code. If you are familiar with "more code," you can write that
way to start using REBOL. There are other areas where REBOL has power,
and it would be a shame to lose that power just because you can't write
the most compact REBOL. But as you get handier with REBOL, you can
start making your code more compact, and tap into that next level
of power.
===And in conclusion
This document tries to fill in a space between a reference and a tutorial.
A reference gives details about how to use specific features, but does
not necessarily explain how to put those features together to solve a
problem. A tutorial shows examples of how to do things but not necessarily
in great detail if the tutorial is trying to explain a lot of things
without being a huge document. This document takes one problem and tries
to explain in some detail how to use REBOL to solve it.
The problem being addressed here is what to do when you come up against
a CSV or fixed-format file and want to get the data items out of it to
do something useful. If you know REBOL, then that problem probably
would be trivial to solve for you. But if you don't know REBOL well,
and are experiencing the "where do I start" reaction, the tips and tools
here might help.