Thursday, October 15, 2009

Accurate PIC Delays

Time to learn PIC's all over again.
This entry is written in a very rough, jotting-down-notes style, and I apologize for that.

I have a PIC-PG4D from Olimex/Spark Fun Electronics. Spark Fun no longer calls it PG4D, and it's not listed under the PIC programmer section, but they still sell it for $27.95 under SKU DEV-00001 in development tools, http://www.sparkfun.com/commerce/product_info.php?products_id=1, called Development Board with Onboard Programmer. I think this is by far the best value to get started with PICs even today, compared to USB programmers.

It's a serial JDM-type programmer, and because it's JDM-compatible, many generic PIC programming software tools work with it. My favorite programmer software is the command line "picprog" listed at http://hyvatti.iki.fi/~jaakko/pic/picprog.html. I use any old text editor to edit the assembler code, and gpasm to assemble it into the hex file that gets burned onto the chip via the programmer.

I used to have a parallel port programmer, but the parallel port has disappeared from today's laptops; they only come with USB ports anymore. USB to parallel port converters don't work for PIC programming. Seriously, just don't bother. There is someone who sells custom USB2LPT parallel converters, but even he says that with each IN instruction the USB frame must be waited on for 125 us, which can increase programming time 100-fold. So if programming a chip took under a second on a conventional parallel port programmer, it may take over a minute through one of these converters, if they worked at all. Just don't bother. Unless you have a PCMCIA slot or a desktop computer with a free PCI slot, each of which accepts a regular fast parallel port adapter, a parallel port programmer is out of the question; if your only option is USB, you can only use USB or USB/serial adapters.
USB adapters are becoming the norm, but for starting out and really learning the guts of what's going on, the serial port is still the best option. Unfortunately most USB/serial converters don't provide the classical -12V/+12V, sufficient voltage to program a chip; they most likely go 0 to 5 V, or something intermediate between 5 V and 12 V, such as 8 V, which still works well enough for most serial port communication. Luckily I bought a Belkin F5U216 USB dock station for like $20 or so back in 2004 at Best Buy. It's a really clumsy device, with a useless VGA passthrough cable that adds to the cable mess you have to drag around with the laptop, but it does have its own separate power adaptor, and it seems to supply enough voltage to program a PIC. It's an FTDI chip based device; another, less clumsy dock station that's PL2303 based I had no luck with, as far as programming voltage goes. Your luck may vary with these USB to serial adapters, and you may have to go through quite a few till you find one that's able to program chips via USB/serial JDM adapters. But ultimately it may be worth the effort.

The alternative of buying a separate USB programmer, and then using a regular low voltage USB/serial adapter to talk to the chip, may be cheaper in the end, but it requires moving the chip from socket to socket, and the pins may get bent and broken off. Another option, of course, is to supply the programming voltage directly, and use a transistor to switch it based on the low voltage coming from the serial port. But that you have to build yourself; you can't buy it as a ready-made kit that's guaranteed to work. I wonder why Olimex/kitsrus and the rest don't sell serial programmers that either work just off the serial port, or have an option to connect a high voltage source in case the serial port doesn't provide it. After all, internally, the USB based programmers also use a USB/serial converter. Having just a serial programmer, plus a separate USB/serial adapter, frees up the USB/serial adapter for other uses too.

The picprog author says that USB to serial adapters will work slowly, because the serial control lines need to be toggled, and each of those operations takes milliseconds, so fully programming a chip can take up to an hour. But my Belkin F5U216 USB/serial dock station programs a PIC16F628A in like 3 to 5 seconds.

Once you have a programmer, you can solder up the many fun circuits you can find all over the web. The classic chip is the PIC16F84A, which is what most classic tutorials and circuits are about. It's a very good midpoint of the spectrum to dive in at to start out learning, but eventually you'd move on to the newer and cheaper PICs, either the lower-performance 10F/12F series or the higher-performance 18F series. Actually the cheaper 16F628 is equivalent to a 16F84 once the comparators are turned off, and that's the recommended way to go.

Microchip has all the datasheets you need, and they are extremely well documented. Currently, Allied Electronics, with a minimum order amount of $30, sells PIC's very cheaply: search for Pic, then limit the search to I/P so you don't get surface-mount SOICs. PIC10F200 is $0.46, PIC12F683 $1.18, PIC16F54 $0.55, PIC16F628A $1.73, PIC16F88 is $2.60, and they no longer sell the PIC16F84A, even though last week they still did. Out of the above bunch the PIC16F628A is the recommended one to start with, and should be supported by most programmers and software you can find around the net. The PIC16F88 is the candy/king of the bunch, and still sufficiently F84-like to get started with.

The best starting point for absolute beginners, and programmers who've never seen assembler programming in their life, is http://www.mstracey.btinternet.co.uk/pictutorial/picmain.htm. It's PIC16F84 based, but it's directly applicable to the 628A, so don't worry.


One of the benefits of PIC programming is accurate timing on the microsecond scale. While in the past one could use an IBM PC with MSDOS to control external devices directly through the parallel port, most modern multitasking operating systems no longer allow accurate timing, and delays/hiccups of CPU availability on the order of 250 milliseconds or more should be expected. If your application needs to log something once an hour or so, that is more than sufficient, but if you need exact timing, such as talking to a DS18B20 thermometer or a HD44780 LCD, direct PC control is almost out of reach. I remember in 2006 I was asked to create software that slowly ramped up the voltage on a power supply, from 0 to 300 V, to coat some electrocoated panels. With the timers provided by the windows API, the fastest delay time inside windows NT was 20 ms, with unpredictable occasional hiccups of over 200 ms. When the ramp time is 15 seconds, a half-second hiccup near 80 V along the ramp may or may not significantly affect the reproducibility of the test. Running on top of a nonrealtime OS, where the OS may capriciously decide to churn the harddrive, or dump some memory cache, or attempt a network connect timeout in the middle of what you're doing, is an iffy situation. This nonrealtime, preemptive issue keeps the computer from directly automating and controlling things such as a nuclear power plant or a submarine, and direct control and accurate timing is handed off instead to dedicated chips, such as a sound chip, or a serial UART, etc. Hence the need for the PIC, and for learning how to program it. If you want to control motors, chips, any kind of devices, and you want to make good scientific measurements, you can use a PIC either as a standalone computer with no harddrive, keyboard or display, or as a PIC connected via a serial port to a PC, where the PIC is your accurate realtime buffer between the moody, unpredictable, mysterious computer OS and the real world.

I don't even know what programs are running on windows anymore, since many of them can be hidden even from the task manager. At the Linux command prompt a simple ps ax or top lists most running tasks, and it feels more secure; even if Linux is generally under very heavy sabotage, at least there are no hidden spy features built directly into it, because the sourcecode is available for inspection, unless gcc inserts something, but that sourcecode is available for inspection too. Also, on a network connected windows computer MS has direct access, and can piss in your cereal at any time: right when you want to make a real world measurement, it can execute a remote procedure call on you. If you shut down the rpc service on NT, a countdown messagebox starts and automatically reboots the computer. That's a big no no. RPC has to run at all times. If you don't like it, what else you gonna do? Go to a competitor? Good luck finding one. Running an isolated windows session may not even be possible in a few years.

This is also where the PIC's are a refuge, since they are meant to function standalone: no keyboard/monitor/harddisk/memory, just a nanowatt battery, and possibly a 3.5 mm earphone plug serial connection, or an RF/serial connection, or maybe a set of LED's or even an LCD. Oh what freedom it is. Slow, can't do much on it, but at least no bullshit, because what you can do, you can rely on. And redundancy is cheap.
Imagine making your own garage openers, temperature monitors, or even home security systems. Automating your world, Jetsons style. The PIC's make it affordable. As long as there is roughly equivalent competition, such as from the AVR microcontrollers of Arduino fame, or even the Intel 8051, prices, affordability, and proper, customer-focused market behavior should follow naturally. If any one vendor gets too successful and leaves the others behind, or forces them completely out of business, the market could turn into a monopolistic nightmare.

So here I am trying to learn PIC's again. Every time I try, Da Man uproots me and gets me out on the street, without a roof over my head. I'll never learn.

As you start out learning PIC's, the very first thing you'll do is flash an LED: set up the input/output ports, and toggle the bit values on them on and off. Of course a PIC running at 20 MHz has an instruction execution time of 0.2 microseconds, which is far too fast for the human eye to see, so you have to learn delays.
Since you're working at a very low level, only byte values up to 255 are available, and any number such as 10 million has to be expressed and manipulated via such small values. Subtracting 1 from 0 rolls under to 255 in a byte, and adding 1 to 255 rolls over to 0, with the carry/borrow flag recording the overflow on the arithmetic instructions (there's a little two-byte counter sketch a few paragraphs down). It seems like such a bother compared to modern C compilers, but what you get in exchange is knowing exactly what happens on the CPU, no mystery about it. The possibility of a virus infection brought to you by the compiler is nil, because you can disassemble and examine each and every cpu instruction, and understand exactly what it does. Secure computing. It's only possible on a very small scale, with very low complexity systems, but that's where everything starts, before scaling up.

Talking about being a computer security expert without understanding assembly is ridiculous. I never had the chance to learn assembly before; I'm pretty fluent in BASIC, Pascal and C, and from there of course in most similar high level languages like javascript/java/C++/python, but assembler has always been a mystery to me. There is always a chance that the C compiler is rigged, and unless you're able, at least in theory, to personally examine the compiler output, you can never be sure about your computer's security. Though the price is dear, another benefit of assembler programming is that you're able to use hardware directly without trespassing on and violating someone else's copyright and intellectual property. You can write your own OS or your own compiler, if you wish.

It used to be that computer scientists all knew the inner workings of computers, and they were all able to program in assembler. They didn't usually do so, because of the benefits and ease that high level programming tools provided, but they understood what those tools ultimately did, and were able to create high performance, quality software when necessary by dropping back to assembler and working with the hardware directly. That was true computer science. Today's programmers coming from diploma degree mills are slaves to the tools they are given. If these tools suck in performance, anything they create with them sucks. They don't know how to create better tools. They are told they are forbidden to even try to understand how a tool works, because that would involve reverse engineering. Eventually only an elite, a select few special circle of people, will be allowed to program with high performance, directly on the hardware, and everyone else will be mandated to use the high level, expensive, remotely monitored tools, because otherwise it would be a trespass on fully locked down and perpetual patent rights. Only that select few will have access to the patent rights, especially once patents are set to never expire, or are renewable ad infinitum. It's called a competitive advantage in the name of self interest. Everyone else will have to live and run on top of artificially sabotaged and held back tools. We can even see that today; java and dotnet seem very much like that. Hence Vista sucks. DOS, with a parallel port, and Quake, used to rock. They blew the minds of their users with hotrod speed.
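To make the rollover point above concrete, here is a tiny sketch of counting past 255 by chaining two byte registers together. It's my own illustration, not from any tutorial; count_lo and count_hi are made-up names you'd have to reserve in a cblock, just like the d1/d2/d3 registers in the listing further down.

        incfsz  count_lo, f             ;bump the low byte; the skip only happens when it rolled over 255->0
        goto    no_carry                ;low byte didn't roll over, so leave the high byte alone
        incf    count_hi, f             ;low byte wrapped around, so carry the 1 into the high byte
no_carry
                                        ;...carry on with whatever you were doing

Two chained bytes count to 65,535; three, like the d1/d2/d3 trio in the delay routine below, cover 16,777,215.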
In today's world, being mandated to use developer tools forced onto users by monopolies feels like a forced religious conversion, a violation of the First Amendment of the Constitution. Moreover, the backdoors put into network connected computers, and the constant remote watchful eye of the proprietary system holder making sure no wrong clicks and no intellectual property violations are happening, intruding into the user's private home through the network wires, seem like a violation of the Fourth Amendment of the Constitution. All the while viruses and hackers have no problem trespassing either, and in fact they are used to further scare and intimidate end users into blindly obeying a centralized high command. Nazi style. These trespasses on individual freedoms are happening out of self interest, as a power grab and a means of control, by those doing them. End users are no longer in charge of their own lives, or their own destinies, at least as far as computing destinies are concerned.

What can you do in view of all of this? Well, you can use Linux, but that too is so bloated and under such heavy sabotage these days (compare Knoppix 3.4 to 6.0; I wonder if Klaus is still alive, and it's not just an imposter releasing newer versions acting like it's him) that it's no safe haven either. PIC's are a safe haven in the sense that they are so small there is no room to even run an OS, or do anything really complicated to deceive the user. It's just you against the bare cpu, and you can still get some very neat, exciting and useful things accomplished with it. Artificial intelligence is impossible with them, in view of the very limited resources and speeds available, at least compared to today's supercomputer simulators. The bang per buck, the benefit of usefulness they provide versus the risk of artificial intelligence development, is very high. And security is a given, and redundancy is cheap.

In fact microcontrollers could safely run and automate nuclear power plants, space stations, cars, etc., with generally available, small learning curve skills across the whole population. After another world war, or a nuclear holocaust, the remnants of technology such as complex computers would be unusable by Joe Schmoe, but a microcontroller could be, and rebuilding the world could be accelerated. If a space station fails, and 3 astronauts are stuck on it, they are simply unable to take care of things for themselves and fix them, unless, of course, everything is easy to fix, everything is running on top of things they understand, such as microcontrollers. I come from a chemical manufacturing/science background, where proper measurement, and time, are very important, if nothing else, for safety reasons. PLC's and ladder logic fulfill these functions today, but cost wise, PLC's and PIC's are in a different ballpark, and tenfold redundancy is similarly cost prohibitive with PLC's, unlike with PIC's. Automation can eliminate a tremendous amount of backbreaking work, and make the world a more efficient, easier and safer place to live in. Microcontrollers seem like a Godsend in this regard. If one can only learn them. I'm not sure I'm smart enough, in a sense, to learn everything, to "take charge of my destiny," and learn how to fix the things that can be fixed in my life around me, or improve the things that can be improved, but at least I can have a go at it, I can try. But I wandered off a bit from the topic... back to delays.

One of the most beautiful PIC instructions is the "nop", no operation. You never encounter it in high level programming languages, where things are obfuscated and uncertain, but wherever you see it, it's a comforting sign that things are running under full accountability of time. After all, besides his own time and pay, a main resource a computer scientist has to budget is the execution time of the software, the other limit being memory. These three things have to be held in balance: programmer time, execution time, memory consumption. These days everything is focused on programmer time, with grave sacrifices in execution time and memory. And that would be all well, since even today it is the programmer time that's the only real expense. However, as programmed devices become ubiquitous and energy consumption important, the nanowatt power PIC's with their small memories will be more than adequate for many functions. Such as Roomba's. But I'm drifting off topic again. Back to delays...

Listing of ledblink.asm:
;Tutorial 1.2 - Nigel Goodwin 2002 - initial template
;modified by me, author at kolomp.blogspot.net

        LIST    p=16F628                ;tell assembler what chip we are using
        include "p16f628.inc"           ;include the defaults for the chip
;       processor p16f628
;       __config 0x3D09                 ;sets the configuration settings (oscillator type etc.)

        cblock  0x20                    ;start of general purpose registers
                d1                      ;used in delay routine
                d2                      ;used in delay routine
                d3                      ;used in delay routine
        endc

        org     0x0000                  ;org sets the origin, 0x0000 for the 16F628,
                                        ;this is where the program starts running
        movlw   0x07
        movwf   CMCON                   ;turn comparators off (make it like a 16F84)

        bsf     STATUS, RP0             ;select bank 1
        movlw   b'00000000'             ;set PortB all outputs
        movwf   TRISB
        movwf   TRISA                   ;set PortA all outputs
        bcf     STATUS, RP0             ;select bank 0

Loop
        movlw   0xff
        movwf   PORTA                   ;set all bits on
        movwf   PORTB
        nop                             ;the nop's make up the time taken by the goto
        nop
        call    Delay2s                 ;this waits for a while!

        movlw   0x00
        movwf   PORTA
        movwf   PORTB                   ;set all bits off
        call    Delay50ms
        goto    Loop                    ;go back and do it again

; Delay = 0.05 seconds
; Clock frequency = 20 MHz
; Actual delay = 0.05 seconds = 250000 cycles
; Error = 0 %

Delay50ms                               ;249993 cycles
        movlw   0x4E
        movwf   d1
        movlw   0xC4
        movwf   d2
Dly50ms_0
        decfsz  d1, f
        goto    $+2
        decfsz  d2, f
        goto    Dly50ms_0
                                        ;3 cycles
        goto    $+1
        nop
                                        ;4 cycles (including call)
        retlw   0x00


; Delay = 2 seconds
; Clock frequency = 20 MHz
; Actual delay = 2 seconds = 10,000,000 cycles
; Error = 0 %

Delay2s                                 ;9999995 cycles
        movlw   0x5A                    ;90
        movwf   d1
        movlw   0xCD                    ;205
        movwf   d2
        movlw   0x16                    ;22
        movwf   d3                      ;6 cycles so far

Dly2s_0 decfsz  d1, f                   ;first trip to zero: 90*7=630 cycles, the 90th pass takes d2 to 204
        goto    $+2                     ;subsequent trips to zero take 7*256 (from the 0-1=255 rollover, i.e. 256-1)
        decfsz  d2, f                   ;first trip to zero activated by d1->0 (d2 goes to 204), total 204*256*7+90*7
        goto    $+2                     ;subsequent trips to zero total 256*256*7
        decfsz  d3, f                   ;first trip to zero is the only trip to zero, activated by d2->0
        goto    Dly2s_0                 ;total 21*256*256*7+204*256*7+90*7 = 9999990-1 cycles
                                        ;the -1 comes from the final decfsz d3 skipping a goto

        nop                             ;1 cycle
        retlw   0x00                    ;4 cycles (including call)

        end


I hope you just scrolled through that and continued reading here. I apologize for the indentation, but the pre (preformatted) tag is not obeyed in this blog, and I keep losing the tab spaces. You might be best off copying and pasting it into a text editor and hand-formatting the comments and indentation. I basically went and followed the tutorial at http://www.winpicprog.co.uk/pic_tutorial.htm
and modified it to work with the PG4D I have. I had to change the config bits from 0x3D18 to 0x3D09 (using Kcalc's scientific mode to convert between hex and binary, together with the 16F628's datasheet PDF) to switch to the external 20 MHz oscillator, as opposed to the 4 MHz internal oscillator the tutorial uses. I used to scratch my head at the delay routines, until I found the source code generator at http://www.golovchenko.org/cgi-bin/delay

The above source code, when assembled with
gpasm ledblink.asm
at a command prompt, outputs a ledblink.hex file that looks like this:
:020000040000FA
:1000000007309F0083160030860085008312FF3082
:1000100085008600000000001D200030850086005D
:10002000122007284E30A000C430A100A00B1928D0
:10003000A10B16281B28000000345A30A000CD3038
:10004000A1001630A200A00B2628A10B2828A20B85
:060050002328000000342B
:00000001FF
This hex file, containing only the machine instructions with the source comments stripped, can be directly burned onto a PIC, or disassembled with gpdasm, but the disassembled output looks pretty haywire. It's nothing like the original source code, full of comments and the original author's choice of variable names.
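If you're curious what the disassembly looks like, the invocation is something like (I'm going from memory on the exact processor name spelling, so check gpdasm --help)
gpdasm -p 16f628 ledblink.hex
which prints the addresses, the raw 14-bit instruction words and their mnemonics, with all the labels and comments gone.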
To burn the hex file onto the PIC: my Belkin F5U shows up as /dev/ttyUSB0, so after flipping the switch on the DEV-00001 and disconnecting the power source, I burn it with the command
picprog --burn --device=pic16f628 -i ./ledblink.hex --jdm --pic-serial-port=/dev/ttyUSB0
The very bottom subroutine, Delay2s, is the one I tried to understand in detail. The neat thing on a PIC is that you can count time by simply counting the lines of instructions. To reiterate, each instruction cycle in a PIC takes 4 clock cycles, so a 4 MHz crystal gives a 1 us instruction time, and a 20 MHz crystal, the one that comes with the DEV-00001 from Spark Fun, a 0.2 us instruction time. For a 2 second delay, we need 10 million instruction cycles executed before proceeding to turn the LED back on.
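Spelled out, as my own quick sanity check of that number:

20,000,000 clock cycles per second / 4 clocks per instruction cycle = 5,000,000 instruction cycles per second
2 seconds * 5,000,000 instruction cycles per second = 10,000,000 instruction cycles
(at 4 MHz it would only take 2 * 1,000,000 = 2,000,000)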

At the Delay2s label, which serves as a marker for the call instruction to jump to, we start following the execution of instructions. movlw means move literal value to W. W is the working register, and the PIC is absurdly simplistic in having a single working register. Have I mentioned PIC's also follow the Harvard architecture, as opposed to von Neumann, where code and data space are shared? This feature makes it even more secure, since buffer overflows of data don't turn into instructions. Though there are ways to circumvent such things and make data behave as if it were instructions, it's at least a first line of defense, an extra safety barrier.
Once the W register is filled with the hexadecimal number 0x5A (which, using Kcalc, turns out to be 90 decimal, or 1011010 binary), the next step is movwf d1, meaning move the contents of the W working register to file d1. d1 was set up at the beginning of the source code to mean the general purpose register at address 0x20 (d2 is 0x21, and d3 is 0x22; you can find the register map in the datasheet of the PIC16F628A), each of these d# registers being able to hold a byte, a value from 0 to 255. Since we're trying to iterate 10 million times, 256 is not enough, 256*256=65,536 is not enough either, but 256*256*256=16,777,216 is enough to represent 10,000,000, so we need 3 bytes of memory. The three values of 90, 205 and 22 were obtained from http://www.golovchenko.org/cgi-bin/delay and the verification of how they work out is explained below.
Once the 3 initial values are set up in the registers, we proceed. The next instruction, decfsz, meaning decrement file and skip if zero, is a branching, conditional instruction, similar to if..then constructs in high level languages. It takes 1 instruction time (4 clock cycles) to execute, except when the skip condition turns out to be true, in which case it takes 2 instruction times (8 clock cycles). This compensates for the skipped instruction not being executed, so you can still tally total execution time by simply counting lines of code and multiplying by the instruction time factors, independent of the conditions being true or false.
The block of code from Dly2s_0 costs 7 instruction cycles for each pass. From the instruction listing in the datasheet we can see that gotos take two instruction cycles to complete, and it's beautifully written with goto $+2, jumping ahead 2 instructions in the execution. So when the code starts out, d1 is 90, decfsz brings it to 89, proceeding to the next instruction goto $+2 gives 1+2 instruction times so far, then another goto, and the final goto give 1+2+2+2=7 instruction times before decfsz brings d1 to 88. The process repeats itself until d1 ends up at 0. At that point the following goto is skipped, and instead d2 is decremented from 205 to 204, and proceeding along, counting the instructions, we see that the total still stays at 7 when we arrive back at Dly2s_0.

At this time register d1 contains 0, and subtracting 1 from it rolls it under to 255, as if the contents were 256. So d1 contained 90 only during the first countdown; for subsequent countdowns it counts from 255 down to 0. While d1 is not 0, we keep repeating the prior steps, 1+2+2+2 instructions, d2 staying at 204, while d1 goes 255, 254, 253, ... 3, 2, 1, 0, and at that point the skip-if-zero becomes true again, and now the decfsz d2 instruction is executed, bringing d2 to 203. This process repeats itself 204 times, with a total of 204*256*7+90*7 instruction cycles passed by the time d2 becomes 0. Now decfsz d3 is executed, and similarly, by the time d3 becomes 0 we have executed 21*256*256*7+204*256*7+90*7 = 9999990-1 instruction cycles. The -1 comes from the final decfsz d3 skipping a goto: previously this step took 1 cycle for the decfsz and 2 cycles for the goto, a total of 3 cycles, but now decfsz, with a true skip condition, becomes a 2-cycle instruction, and there is no goto, so we lose a cycle, 3 vs. 2. To compensate for that, a single nop is executed next. (The full tally is written out below, after this digression.)

There is an equation one can come up with to calculate those values, but one may forget exactly how it goes when a quick delay routine is needed. The easiest procedure is therefore: figure out how many factors of 256, i.e. how many register variables, you need to cover your target count, then work down from the top. Pick a trial number for the highest register, such as 21, so that 21*256*256*7 (the 7 being 2*number of variables + 1 cycles per pass) gets you just under your target value; the value actually loaded into that register is the trial number plus 1, here 22. Then pick the next trial number, here 204, so that adding 204*256*7 still keeps you under it, and make the final tweak with the lowest register, double-checking like we just did. After a few times of practice all this will be a breeze, assuming you've had the patience to go through it step by step.

Patience is pretty much the most important thing for math. If you only saw what kind of lengthy things Leonhard Euler did when coming up with some of his formulas, such as, if I remember right, Sum(1/n^2) = pi^2/6 (http://www.physicsforums.com/showthread.php?t=80591): lots and lots of patience and super-accurate, methodical hand calculations. Most people throw in the towel long before he would. That's what made him different from the rest of the world. Edison said genius is 1% inspiration and 99% perspiration. Tesla used to ridicule him for that, how he would go ahead and diligently start inspecting each straw right away when looking for a needle in a haystack, instead of first trying to figure out lazy ways to eliminate half the haystack by some logical reasoning, if possible. But nevertheless the statement stands.
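And for the record, here is the full Delay2s tally written out, my own double-check of the generator's numbers:

21*256*256*7 = 9,633,792 cycles
204*256*7    =   365,568 cycles
90*7         =       630 cycles
loop total   = 9,999,990 cycles, minus the 1 cycle saved on the final skipped goto = 9,999,989
plus the 6 cycles of movlw/movwf setup = 9,999,995 (the figure in the Delay2s comment)
plus 1 cycle for the nop and 4 cycles for the call/retlw pair = 10,000,000 cycles
10,000,000 cycles * 0.2 us/cycle = 2.000000 seconds at 20 MHz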

There are many ways to do delays, as listed at http://www.piclist.com/tecHREF/microchip/delay/general.htm , some of them relying on nesting one "call" deep inside another "call", which uses up stack space. Stack space is at a premium, especially on the PIC10 and PIC12 series, where the baseline parts only have a two-level hardware stack.
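For contrast, here's a rough sketch of the nested-call style (my own illustration, not taken from the piclist page, and the cycle counts aren't tuned to be exact): an outer routine that burns time by calling an inner one over and over. It's readable, but every level of nesting occupies one of those precious stack slots while it runs.

Delay1s                                 ;roughly one second at 20 MHz, ignoring the small loop overhead
        movlw   d'20'                   ;20 x 50 ms = 1 s
        movwf   d3                      ;reusing d3 from the listing above as the outer counter
D1s_0
        call    Delay50ms               ;this call occupies a second stack level
        decfsz  d3, f                   ;(the caller of Delay1s already used the first)
        goto    D1s_0
        return

Built this way, Delay1s needs two stack levels; call it from inside yet another subroutine and you're at three, which a baseline PIC10 doesn't even have.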

PS: it seems something is not right with leaving the __config 0x3D09 line commented out; when I uncomment that line, the LED flashes right.
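By the way, instead of computing a raw hex number in Kcalc, the config word can also be spelled out with the symbolic fuse names from the include file, which is easier to read and harder to fat-finger. Something along these lines (the symbol names are from the p16f628.inc shipped with gputils, and this particular fuse combination is my guess at a sensible setup for the external 20 MHz crystal, not a verified match for 0x3D09, so check it against the datasheet before burning):

        __config _HS_OSC & _WDT_OFF & _PWRTE_ON & _MCLRE_ON & _BODEN_ON & _LVP_OFF & _CP_OFF

gpasm then computes the hex value for you, and the intent stays readable in the source.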
