As of July 17, 2015, the LabJack forums here at forums.labjack.com are shut down. New registrations, topics, and replies are disabled. All forums are in a read-only state for archive purposes.

Please visit our current forums at labjack.com/forums to view and make new posts. To post on the current forums, use your labjack.com login account. Your old LabJack forums login credentials have been retired. There are no longer separate logins for labjack.com and LabJack forums.


Communication Failure


9 replies to this topic

#1 drelidan7

drelidan7
  • Members
  • 7 posts

Posted 11 January 2013 - 01:59 PM

Hello. I am having an issue with a seemingly random Communications Failure error that is happening intermittently. It tends to happen the first time the software using the LabJack is run daily, at the same point. It has happened at this point a couple of other times without the first-run condition.

The sequence that occurs to cause this error is...

Initialize LabJack (create new object, initialize I/Os to default values, start our class timers that are used for polling the device).
Set one I/O pin to 1 and then to 0.
Poll device for 300 ms to collect data.
Stop polling device.
Wait ~20-60 seconds.
Initialize LabJack again (this time, we only start our class timers)
Set one I/O pin to 1 and then to 0. [Communications failure happens here sometimes]
Set one I/O pin to 1.
Poll device for 20 seconds. [Communications failure happens here sometimes as well]
--- Crash has always happened by this point on the first boot.

After restarting the PC, the issue does not occur for about another day.

Some challenges with debugging this issue include:

The LabJack is attached to a machine that does not run explorer.exe (it's more or less a KIOSK application - we have access to explorer, but launch straight into our own application).
The LabJack cannot be removed from the system (there's a complicated wiring scheme/harness that keeps the LabJack in place).
Some disassembly is required to see the LEDs on the LabJack, and no one has observed the LEDs while this error has occurred.

System Configuration:

We are running Windows 7 Professional 32-bit.
We are accessing the LabJack through C# and LJUDDotNet.

Code:

Initialize:

[codebox]
public void Init(double dIoUpdateRate = DefaultUpdateRate)
{
    try
    {
        this.SetUpdateRate(dIoUpdateRate);

        // If the LabJack device is null there was an error somewhere
        // in our startup sequence
        if (eLabJackDevice == null || bEStopEnabled)
        {
            this.InitHW();
        }

        this.InitTimer();
        byCommStatus |= COMM_STREAMING_RUNNING;
        //EStopClear(); //should be redundant
    }
    catch (LabJackUDException e)
    {
        logger.Fatal("LabJack Exception occurred during Initialize method.", e);
        throw e;
    }
    finally
    {
        logger.Debug("LabJackHw: Exiting Init method.");
    }
}
[/codebox]

EStopClear (Exception occurs at the first instance of GoOne):

[codebox]
/// <summary>
/// Clear the E-Stop signal
/// </summary>
/// <exception cref="LabJackUDException"></exception>
public void EStopClear()
{
    logger.Debug("LabJackHw: Entering EStopClear Method.");
    try
    {
        // Set the E-Stop clear signal active
        logger.Debug("Adding request to set E-Stop clear signal to active.");
        LJUD.AddRequest(eLabJackDevice.ljhandle, LJUD.IO.PUT_DIGITAL_BIT, PIN_E_STOP_CLEAR_OUT, 0.0, 0, 0);

        //Execute the list of requests.
        logger.Debug("Executing LabJack requests.");
        LJUD.GoOne(eLabJackDevice.ljhandle);

        // Set the E-Stop clear signal nonactive
        logger.Debug("Adding request to set E-Stop clear signal to nonactive.");
        LJUD.AddRequest(eLabJackDevice.ljhandle, LJUD.IO.PUT_DIGITAL_BIT, PIN_E_STOP_CLEAR_OUT, 1.0, 0, 0);

        //Execute the list of requests.
        logger.Debug("Executing LabJack requests.");
        LJUD.GoOne(eLabJackDevice.ljhandle);

        bEStopEnabled = false;
    }
    catch (LabJackUDException e)
    {
        byCommStatus |= COMM_ERRORS_PRESENT;
        //if (logger.IsDebugEnabled) logger.Debug("Updated byCommStatus to: " + byCommStatus);
        logger.Fatal("Labjack Exception occurred during EStopClear method.", e);
        throw e;
    }
    finally
    {
        logger.Debug("LabJackHw: Exiting EStopClear Method.");
    }
}
[/codebox]

Polling method (not sure where the exception occurs within this method):

[codebox]
/// <summary>
/// LabJack Timer Event used to sample the Labjack I/O
/// </summary>
/// <param name="source">Unused.</param>
private void LabJackUpdateTimedEvent(object source)
{
    logger.Debug("LabJackHw: Entering LabJackUpdateTimedEvent Method.");

    //Exit if the eStop is enabled.
    if (bEStopEnabled) return;

    double dPedal1Home, dPedal2Home, dPendantStart, dPendantSlower, dPendantFaster,
           dEstopTripped, dEstopButton, dAnalog1, dAnalog2;
    //UInt32 dwBits = 0x00000000;
    byte byPedalHomeStatus, byPendantStatus, byEStopStatus;
    HWGlobals.LabJackDataPacket aPacket;

    try
    {
        lock (UpdateTimedEventLock)
        {
            dPedal1Home = 0.0;
            dPedal2Home = 0.0;
            dPendantStart = 0.0;
            dPendantSlower = 0.0;
            dPendantFaster = 0.0;
            dEstopTripped = 0.0;
            dEstopButton = 0.0;
            dAnalog1 = 0.0;
            dAnalog2 = 0.0;
            byPedalHomeStatus = 0x03;
            byPendantStatus = 0x0F;
            byEStopStatus = 0x03;
            aPacket = new HWGlobals.LabJackDataPacket();

            // Add the requests to read all of the I/O from the LabJack
            logger.Debug("Adding LabJack requests: Get status of all Inputs");
            LJUD.AddRequest(eLabJackDevice.ljhandle, LJUD.IO.GET_AIN, (U3.CHANNEL)PIN_AIN_STRAIN_0, 0, 0, 0);
            LJUD.AddRequest(eLabJackDevice.ljhandle, LJUD.IO.GET_AIN, (U3.CHANNEL)PIN_AIN_STRAIN_1, 0, 0, 0);
            LJUD.AddRequest(eLabJackDevice.ljhandle, LJUD.IO.GET_DIGITAL_BIT, (U3.CHANNEL)PIN_PEDAL1_HOME_IN, 0, 0, 0);
            LJUD.AddRequest(eLabJackDevice.ljhandle, LJUD.IO.GET_DIGITAL_BIT, (U3.CHANNEL)PIN_PEDAL2_HOME_IN, 0, 0, 0);
            LJUD.AddRequest(eLabJackDevice.ljhandle, LJUD.IO.GET_DIGITAL_BIT, (U3.CHANNEL)PIN_PENDANT_START_IN, 0, 0, 0);
            LJUD.AddRequest(eLabJackDevice.ljhandle, LJUD.IO.GET_DIGITAL_BIT, (U3.CHANNEL)PIN_PENDANT_SLOWER_IN, 0, 0, 0);
            LJUD.AddRequest(eLabJackDevice.ljhandle, LJUD.IO.GET_DIGITAL_BIT, (U3.CHANNEL)PIN_PENDANT_FASTER_IN, 0, 0, 0);
            LJUD.AddRequest(eLabJackDevice.ljhandle, LJUD.IO.GET_DIGITAL_BIT, (U3.CHANNEL)PIN_E_STOP_TRIPPED_IN, 0, 0, 0);
            LJUD.AddRequest(eLabJackDevice.ljhandle, LJUD.IO.GET_DIGITAL_BIT, (U3.CHANNEL)PIN_E_STOP_BUTTON_IN, 0, 0, 0);
            //LJUD.AddRequest(eLabJackDevice.ljhandle, LJUD.IO.GET_DIGITAL_PORT, (U3.CHANNEL)0, 0, 16, 0);
            //LJUD.AddRequest(eLabJackDevice.ljhandle, LJUD.IO.GET_DIGITAL_PORT, (U3.CHANNEL)0, 0, 16, 0);
            //LJUD.AddRequest(eLabJackDevice.ljhandle, LJUD.IO.GET_DIGITAL_PORT, (U3.CHANNEL)0, 0, 16, 0);
            //LJUD.AddRequest(eLabJackDevice.ljhandle, LJUD.IO.GET_DIGITAL_PORT, (U3.CHANNEL)16, 0, 7, 0);

            //Execute the list of requests.
            //logger.Debug("Executing LabJack requests.");
            LJUD.GoOne(eLabJackDevice.ljhandle);

            // Retrieve the results from the LabJack
            //logger.Debug("Obtaining results for: Get status of all Inputs");
            LJUD.GetResult(eLabJackDevice.ljhandle, LJUD.IO.GET_AIN, (U3.CHANNEL)PIN_AIN_STRAIN_0, ref dAnalog1);
            LJUD.GetResult(eLabJackDevice.ljhandle, LJUD.IO.GET_AIN, (U3.CHANNEL)PIN_AIN_STRAIN_1, ref dAnalog2);
            LJUD.GetResult(eLabJackDevice.ljhandle, LJUD.IO.GET_DIGITAL_BIT, (U3.CHANNEL)PIN_PEDAL1_HOME_IN, ref dPedal1Home);
            LJUD.GetResult(eLabJackDevice.ljhandle, LJUD.IO.GET_DIGITAL_BIT, (U3.CHANNEL)PIN_PEDAL2_HOME_IN, ref dPedal2Home);
            LJUD.GetResult(eLabJackDevice.ljhandle, LJUD.IO.GET_DIGITAL_BIT, (U3.CHANNEL)PIN_PENDANT_START_IN, ref dPendantStart);
            LJUD.GetResult(eLabJackDevice.ljhandle, LJUD.IO.GET_DIGITAL_BIT, (U3.CHANNEL)PIN_PENDANT_SLOWER_IN, ref dPendantSlower);
            LJUD.GetResult(eLabJackDevice.ljhandle, LJUD.IO.GET_DIGITAL_BIT, (U3.CHANNEL)PIN_PENDANT_FASTER_IN, ref dPendantFaster);
            LJUD.GetResult(eLabJackDevice.ljhandle, LJUD.IO.GET_DIGITAL_BIT, (U3.CHANNEL)PIN_E_STOP_TRIPPED_IN, ref dEstopTripped);
            LJUD.GetResult(eLabJackDevice.ljhandle, LJUD.IO.GET_DIGITAL_BIT, (U3.CHANNEL)PIN_E_STOP_BUTTON_IN, ref dEstopButton);
            //LJUD.GetResult(eLabJackDevice.ljhandle, LJUD.IO.GET_DIGITAL_PORT, (U3.CHANNEL)0, ref dBitsLow);
            //LJUD.GetResult(eLabJackDevice.ljhandle, LJUD.IO.GET_DIGITAL_PORT, (U3.CHANNEL)16, ref dBitsHigh);

            // Setup all of the digital IO bits to be in 32-bit value
            //dwBits = (UInt32)dBitsHigh;
            //dwBits <<= 16;
            //dwBits |= (UInt32)dBitsLow;

            // Set the Stride status bits - a '1' is not pressed, a '0' is pressed
            if (dPedal1Home > 0.0) { byPedalHomeStatus &= 0xFE; }
            if (dPedal2Home > 0.0) { byPedalHomeStatus &= 0xFD; }

            // Set the Pendant status bits - a '1' is not pressed, a '0' is pressed
            if (dPendantStart > 0.0) { byPendantStatus &= 0xFE; }
            //if (((dwBits >> PIN_PENDANT_PAUSE_IN) & 0x01) == 0x01) { byPendantStatus &= 0xFD; }
            if (dPendantSlower > 0.0) { byPendantStatus &= 0xFB; }
            if (dPendantFaster > 0.0) { byPendantStatus &= 0xF7; }

            // Set the EStop status bits - a '1' is not pressed, a '0' is pressed
            if (dEstopTripped > 0.0) { byEStopStatus &= 0xFE; }
            if (dEstopButton > 0.0) { byEStopStatus &= 0xFD; }

            //if (logger.IsDebugEnabled) logger.Debug("Updated byCommStatus to: " + byCommStatus);
            aPacket.pedalForceOne = dAnalog1;
            aPacket.pedalForceTwo = dAnalog2;
            aPacket.strideStatus = byPedalHomeStatus;
            aPacket.pendantStatus = byPendantStatus;
        }//end lock UpdateTimedEventLock
        TraceMessage("Exiting UpdateTimedEventLock");

        //byCommStatus |= COMM_STREAMING_RUNNING;
        this.ProcessEStopStatus(byEStopStatus);

        // Callback to the higher level application
        LabJackDataDelegate(aPacket);
    }
    catch (LabJackUDException e)
    {
        byCommStatus |= COMM_ERRORS_PRESENT;
        //if (logger.IsDebugEnabled) logger.Debug("Updated byCommStatus to: " + byCommStatus);
        string msg = "Labjack Exception occurred during LabJackUpdateTimedEvent method." + e.Message;
        logger.Fatal("Labjack Exception occurred during LabJackUpdateTimedEvent method.", e);
        DoEStopActions(msg);
        //throw e;
    }
    finally
    {
        //logger.Debug("LabJackHw: Exiting LabJackUpdateTimedEvent Method.");
    }
}
[/codebox]

#2 LabJack Support

LabJack Support
  • Admin
  • 8677 posts

Posted 14 January 2013 - 08:54 AM

The first thing that jumps out at me is "Crash has always happened by this point on the first boot". A repeatable problem like that is the easiest to look at. Can you make a simplified program running on your desk with a U3 that has this problem, and then we could try it here?

#3 drelidan7

drelidan7
  • Members
  • 7 posts

Posted 15 January 2013 - 02:38 PM

So, fortunately, it appears that we found the First Boot error (and it was unrelated to the LabJack). Unfortunately, there is still another error that exists. The Communication Failure error occurs after the device has been left on overnight.

So we would just do the following steps (after ~16 hours of inactivity).

Initialize LabJack again (this time, we only start our class timers)
Set one I/O pin to 1 and then to 0. [Communications failure happens here sometimes]
Set one I/O pin to 1.
Poll device for 20 seconds. [Communications failure happens here sometimes as well]

If left on for long periods of time and not in use, is it required that we reset the LabJack?

Also, in response to the simplified program: It is possible, though it would take some time (and, well, permission by the project lead).

#4 LabJack Support

LabJack Support
  • Admin
  • 8677 posts

Posted 16 January 2013 - 12:37 PM

Read through some general information here:

http://labjack.com/s...cation-failures

You should not have to do anything special to have the U3 sit idle for a long time. Let's try to determine whether the problem is something with the U3, the USB host, your software, or the computer.

Is it just your software that does this? What if you try the test panel in LJControlPanel instead?

Is your software running overnight, but just not doing much? Or is all software closed overnight and the U3 totally idle?

What OS are you using? Does it make a difference if you try a different computer?

#5 drelidan7

drelidan7
  • Members
  • 7 posts

Posted 22 January 2013 - 07:20 AM

I just wanted to give you an update that this is resolved. It appears that the USB ports on the system go to sleep after a certain period of inactivity. Thanks for the resources and the timely response. :)

#6 LabJack Support

LabJack Support
  • Admin
  • 8677 posts

Posted 22 January 2013 - 07:43 AM

Interesting. What OS are you using? What are you doing to avoid the problem?

#7 drelidan7

drelidan7
  • Members
  • 7 posts

Posted 16 April 2013 - 09:22 AM

Sorry for reviving an old thread. The issue seemed to be caused by the USB 3.0 ports on the computer that we were using randomly going to sleep. We were using a ZOTAC AD10 with Windows 7, and they are notorious for USB 3.0 problems. The solution was to plug it into a USB 2.0 port.

However, we ran into another communication failure for the first time in a while today (on a separate machine), so this workaround has not been flawless.

Is there a standard way to recover from a communication failure with the LabJack? I was thinking of disposing the object and then recreating it to establish a connection again. Would that work? Are there common ways to reestablish connection/return to a stable state based on the different error codes?

Thanks,

#8 LabJack Support

LabJack Support
  • Admin
  • 8677 posts

Posted 16 April 2013 - 02:25 PM

The app note linked in post #4 above has some general information. The watchdog is a common technique to help long-term unattended operation:

Set up the watchdog on the device to reset it if there is no communication for some time. For example, if your program constantly talks to the device once per second, you might set up the watchdog to reset after a 10 second timeout. This reset will likely be enough to signal the host that a device is connected. You can enable the watchdog using "config defaults" in LJControlPanel, but it is better to enable and disable it in your actual software (U12, U3, U6, UE9) so it is not active when the LabJack is sitting idle.


Next time the problem happens, see if you can resolve it without power-cycling the U3. Try just closing out all programs and then re-opening your software. If that does not resolve it, try:

Close all software, go to Windows Device Manager, and find the USB entry for the device. Right-click and "Disable", then right-click and "Enable". If that brings it back, it is a sure sign that the device was fine but the host had indeed suffered a problem.


If that works, you can then try to find a way to do that programmatically through your software.
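The first recovery step above (drop the connection and open it again) can also be attempted from inside the application before falling back to a device disable/enable. The following is only a minimal sketch under stated assumptions: `ReconnectingCall`, `Run`, and both delegates are hypothetical names and not part of LJUDDotNet. The `reconnect` delegate stands in for whatever the application does to dispose its device object and open a new one. Whether a re-open actually recovers the device depends on where the fault is; if the USB host is at fault, it may not help.

```csharp
using System;
using System.Threading;

// Hypothetical helper; none of these names come from LJUDDotNet.
public static class ReconnectingCall
{
    // Runs 'operation'. If it throws, calls 'reconnect' (for example: dispose
    // the old device object and open a new one), waits 'delay', and tries
    // again, for at most 'maxAttempts' attempts in total.
    public static T Run<T>(Func<T> operation, Action reconnect, int maxAttempts, TimeSpan delay)
    {
        if (operation == null) throw new ArgumentNullException("operation");
        if (reconnect == null) throw new ArgumentNullException("reconnect");
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException("maxAttempts");

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return operation();
            }
            catch (Exception)
            {
                // Real code would catch only the driver's exception type
                // (LabJackUDException in the code posted in this thread).
                if (attempt >= maxAttempts)
                {
                    throw; // out of attempts: rethrow with the original stack trace
                }
            }

            // An exception thrown by reconnect() itself is not retried here;
            // it propagates to the caller.
            reconnect();
            Thread.Sleep(delay);
        }
    }
}
```

A call might look like `ReconnectingCall.Run(() => ReadInputs(), () => ReopenDevice(), 3, TimeSpan.FromSeconds(1))`, where `ReadInputs` and `ReopenDevice` are placeholders for the application's own methods.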

#9 drelidan7

drelidan7
  • Members
  • 7 posts

Posted 03 June 2013 - 12:55 PM

Hello again. I've found a way to reproduce communication failures on a relatively consistent basis. This is good, since that means I have a way to test for how to solve the issue.

Closing out all programs and re-opening the software did not solve the issue.

Disabling the LabJack U3 device and then re-enabling it through the Windows Device Manager did resolve the issue.

Now, when it comes to the programmatic method of doing this, I am running into issues. I am using the source code found here:
http://stackoverflow...programatically

Using this code, I am able to disable/enable several devices, from mice to serial communication ports. When I try to disable the LabJack, SetupDiChangeState returns an error code of 0xE000020B (which, for many other drivers, signifies an Unknown Error). Are you aware of any LabJack-specific circumstances that would cause an error when disabling the device?

Thanks,

#10 LabJack Support

LabJack Support
  • Admin
  • 8677 posts

Posted 03 June 2013 - 03:44 PM

I believe an error of 0xE000020B in that context (DIFx) means ERROR_NO_SUCH_DEVINST. That suggests a problem with how the LabJack is being specified. It could be the GUID or the VID/PID, depending on how you are trying to do it.
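For reference, SetupAPI error codes are built in setupapi.h as APPLICATION_ERROR_MASK | ERROR_SEVERITY_ERROR | code, which is why they start with 0xE000. A small helper along these lines can turn the raw value into a name when logging. This is a sketch, not a complete table: `SetupApiErrors` and `Describe` are made-up names, and only a few codes are listed.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical logging helper; the class and method names are illustrative.
public static class SetupApiErrors
{
    // From winnt.h / setupapi.h:
    // SetupAPI error = APPLICATION_ERROR_MASK | ERROR_SEVERITY_ERROR | code
    private const uint ApplicationErrorMask = 0x20000000;
    private const uint ErrorSeverityError = 0xC0000000;
    private const uint Prefix = ApplicationErrorMask | ErrorSeverityError; // 0xE0000000

    // Deliberately partial table: only a few codes relevant to enabling and
    // disabling devices are listed here.
    private static readonly Dictionary<uint, string> Names = new Dictionary<uint, string>
    {
        { 0x20B, "ERROR_NO_SUCH_DEVINST" },
        { 0x231, "ERROR_NOT_DISABLEABLE" },
        { 0x235, "ERROR_IN_WOW64" },
    };

    // Returns a readable description of a value obtained from GetLastError.
    public static string Describe(uint error)
    {
        if ((error & Prefix) != Prefix)
        {
            return string.Format("0x{0:X8} (not a SetupAPI-style code)", error);
        }

        uint code = error & ~Prefix; // strip the 0xE0000000 prefix
        string name;
        return Names.TryGetValue(code, out name)
            ? string.Format("0x{0:X8} = {1}", error, name)
            : string.Format("0x{0:X8} (SetupAPI code 0x{1:X}, not in this table)", error, code);
    }
}
```

After a failed P/Invoke call declared with `SetLastError = true`, the value could be passed in as `SetupApiErrors.Describe(unchecked((uint)Marshal.GetLastWin32Error()))`.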

How reproducible is this communication error? Is it something you can send us? We've run into this kind of issue a few times (where it took a disable/enable in the Device Manager to fix it), but never in a way we could reproduce.

We've looked at adding a SetupDiChangeState call to the driver itself to make this issue easier to fix. If you want to send us the code you are using for SetupDiChangeState, or something to reproduce the issue, you can email it to [email protected]

